The Data Science Pilot Action Set

The dataSciencePilot action set consists of actions that implement a policy-based, configurable, and scalable approach to automating data science workflows. This action set can be used to automate an end-to-end workflow or to automate steps in the workflow such as data preparation, feature preprocessing, feature engineering, feature selection, and hyperparameter tuning. More information about this action set is available on its documentation page.


Today we will set up the notebook and go through each of the actions in the action set.

Setting Up the Notebook

First, we must import the SAS Scripting Wrapper for Analytics Transfer (SWAT) package and use it to connect to our SAS Cloud Analytic Services (CAS) server.


In [1]:
import swat
import numpy as np
import pandas as pd

In [2]:
conn = swat.CAS('localhost', 5570, authinfo='~/.authinfo', caslib="CASUSER")

Now we will load the dataSciencePilot action set and the decisionTree action set.


In [3]:
conn.builtins.loadactionset('dataSciencePilot')
conn.builtins.loadactionset('decisionTree')


NOTE: Added action set 'dataSciencePilot'.
NOTE: Added action set 'decisionTree'.
Out[3]:
§ actionset
decisionTree

elapsed 0.00298s · user 0.000946s · sys 0.00198s · mem 0.22MB

Next, we must connect to our data source. We are using a data set for predicting home equity loan defaults.


In [4]:
tbl = 'hmeq'
hmeq = conn.read_csv("./data/hmeq.csv", casout=dict(name=tbl, replace=True))


NOTE: Cloud Analytic Services made the uploaded file available as table HMEQ in caslib CASUSER(sasdemo).
NOTE: The table HMEQ has been created in caslib CASUSER(sasdemo) from binary data uploaded to Cloud Analytic Services.

In [5]:
hmeq.head()


Out[5]:
Selected Rows from Table HMEQ
BAD LOAN MORTDUE VALUE REASON JOB YOJ DEROG DELINQ CLAGE NINQ CLNO DEBTINC
0 1.0 1100.0 25860.0 39025.0 HomeImp Other 10.5 0.0 0.0 94.366667 1.0 9.0 NaN
1 1.0 1300.0 70053.0 68400.0 HomeImp Other 7.0 0.0 2.0 121.833333 0.0 14.0 NaN
2 1.0 1500.0 13500.0 16700.0 HomeImp Other 4.0 0.0 0.0 149.466667 1.0 10.0 NaN
3 1.0 1500.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 0.0 1700.0 97800.0 112000.0 HomeImp Office 3.0 0.0 0.0 93.333333 0.0 14.0 NaN

Our target is “BAD”, which indicates that a loan went bad. I am setting up a variable to hold our target information as well as our policy information. Each policy applies to specific actions, and I will provide more information about each policy later in the notebook.


In [6]:
# Target Name 
trt='BAD'
# Exploration Policy 
expo = {'cardinality': {'lowMediumCutoff':40}}
# Screen Policy 
scpo = {'missingPercentThreshold':35}
# Selection Policy 
sepo = {'criterion': 'SU', 'topk':4}
# Transformation Policy 
trpo = {'entropy': True, 'iqv': True, 'kurtosis': True, 'outlier': True}

Explore Data

The exploreData action calculates various statistical measures for each column in your data set, such as Minimum, Maximum, Mean, Median, Mode, Number Missing, Standard Deviation, and more. The exploreData action also creates a hierarchical variable grouping with two levels. The first level groups variables according to their data type (interval, nominal, date, time, or datetime). The second level uses the following statistical metrics to group the interval and nominal data:

  • Missing rate (interval and nominal).
  • Cardinality (nominal).
  • Entropy (nominal).
  • Index of Qualitative Variation (IQV; interval and nominal).
  • Skewness (interval).
  • Kurtosis (interval).
  • Outliers (interval).
  • Coefficient of Variation (CV; interval).

This action returns a CAS table listing all the variables, the variable groupings, and the summary statistics. These groupings allow for a pipelined approach to data transformation and cleaning.


In [7]:
conn.dataSciencePilot.exploreData(   
        table  = tbl,
        target = trt,     
        casOut = {'name': 'EXPLORE_DATA_OUT_PY', 'replace' : True},
        explorationPolicy = expo
    )
conn.fetch(table = {'name': 'EXPLORE_DATA_OUT_PY'})


Out[7]:
§ Fetch
Selected Rows from Table EXPLORE_DATA_OUT_PY
Variable VarType MissingRated CardinalityRated EntropyRated IQVRated CVRated SkewnessRated KurtosisRated OutlierRated ... MomentCVPer RobustCVPer MomentSkewness RobustSkewness MomentKurtosis RobustKurtosis LowerOutlierMomentPer UpperOutlierMomentPer LowerOutlierRobustPer UpperOutlierRobustPer
0 BAD binary-target NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 REASON character-nominal 1.0 1.0 3.0 NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 JOB character-nominal 1.0 1.0 3.0 3.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 LOAN numeric-nominal 1.0 3.0 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 MORTDUE interval 2.0 NaN NaN NaN 3.0 1.0 2.0 3.0 ... 60.272664 69.553515 1.814481 0.844221 6.481866 0.370274 0.000000 2.958471 2.241823 1.727306
5 VALUE interval 1.0 NaN NaN NaN 3.0 1.0 3.0 3.0 ... 56.384362 60.247883 3.053344 0.989755 24.362805 0.425793 0.000000 2.479480 0.444596 2.599179
6 YOJ interval 2.0 NaN NaN NaN 3.0 1.0 1.0 2.0 ... 84.888530 142.857143 0.988460 0.977944 0.372072 -0.006105 0.000000 2.314050 0.000000 0.055096
7 DEROG numeric-nominal 2.0 1.0 2.0 1.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
8 DELINQ numeric-nominal 2.0 1.0 2.0 1.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
9 CLAGE interval 2.0 NaN NaN NaN 3.0 1.0 2.0 2.0 ... 47.734255 67.143526 1.343412 0.282945 7.599549 0.061058 0.000000 1.150035 0.000000 0.902335
10 NINQ numeric-nominal 2.0 1.0 2.0 3.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
11 CLNO numeric-nominal 1.0 2.0 3.0 3.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
12 DEBTINC interval 2.0 NaN NaN NaN 3.0 1.0 3.0 2.0 ... 25.464084 28.327403 2.852353 -0.524787 50.504042 0.495258 0.831025 0.703175 0.575325 1.363733

13 rows × 42 columns

elapsed 0.00287s · user 0.00283s · mem 0.996MB


Explore Correlations

If a target is specified, the exploreCorrelation action performs a linear and nonlinear correlation analysis of the input variables and the target. If a target is not specified, the exploreCorrelation action performs a linear and nonlinear correlation analysis for all pairwise combinations of the input variables. The correlation statistics available depend on the data type of each input variable in the pair.

  • Nominal-nominal correlation pairs have the following statistics available: Mutual Information (MI), Symmetric Uncertainty (SU), Information Value (IV; for binary target), Entropy, chi-square, G test (G2), and Cramer’s V.
  • Nominal-interval correlation pairs have the following statistics available: Mutual Information (MI), Symmetric Uncertainty (SU), Entropy, and F-test.
  • Interval-interval correlation pairs have the following statistics available: Mutual Information (MI), Symmetric Uncertainty (SU), Entropy, and Pearson correlation.

This action returns a CAS table listing all the variable pairs and the correlation statistics.
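
The Symmetric Uncertainty statistic, which we use later as our selection criterion, normalizes mutual information by the entropies of the two variables so that it falls between 0 and 1: SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)). Below is a minimal client-side sketch of that relationship for two discrete pandas Series; it is illustrative only, not how the action computes these statistics.

import numpy as np
import pandas as pd

def entropy(s):
    # Shannon entropy (natural log) of a discrete pandas Series
    p = s.value_counts(normalize=True)
    return -(p * np.log(p)).sum()

def mutual_information(x, y):
    # I(X; Y) from the empirical joint distribution
    joint = pd.crosstab(x, y, normalize=True)
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    return sum(joint.loc[i, j] * np.log(joint.loc[i, j] / (px[i] * py[j]))
               for i in joint.index for j in joint.columns
               if joint.loc[i, j] > 0)

def symmetric_uncertainty(x, y):
    # SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y))
    return 2 * mutual_information(x, y) / (entropy(x) + entropy(y))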


In [8]:
conn.dataSciencePilot.exploreCorrelation(
        table = tbl, 
        casOut = {'name':'CORR_PY', 'replace':True},
        target = trt
)
conn.fetch(table = {"name" : "CORR_PY"})


Out[8]:
§ Fetch
Selected Rows from Table CORR_PY
FirstVariable SecondVariable Type MI
0 CLAGE BAD _it_ 0.030242
1 CLNO BAD _it_ 0.015505
2 DEBTINC BAD _it_ 0.063485
3 DELINQ BAD _it_ 0.076942
4 DEROG BAD _it_ 0.048241
5 LOAN BAD _it_ 0.036787
6 MORTDUE BAD _it_ 0.012855
7 NINQ BAD _it_ 0.021363
8 VALUE BAD _it_ 0.016458
9 YOJ BAD _it_ 0.009881
10 JOB BAD _nt_2 0.010523
11 REASON BAD _nt_1 0.001027

elapsed 0.00167s · user 0.00142s · sys 0.000147s · mem 0.953MB


Analyze Missing Patterns

If a target is specified, the analyzeMissingPatterns action performs a missing pattern analysis of the input variables and the target. If a target is not specified, the analyzeMissingPatterns action performs a missing pattern analysis for all pairwise combinations of the input variables. This analysis measures the correlation strength between missing patterns across variable pairs, as well as dependencies between missingness in one variable and the values of the other variable. This action returns a CAS table listing all the variable pairs and the statistics that describe their missingness.
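
As a rough client-side check of what this action measures, we can compute the mutual information between a variable's missingness indicator and the target using the helper functions sketched earlier; the hmeq table is pulled to the client with to_frame(). This is illustrative only, not the action's implementation.

# Mutual information between DEBTINC's missingness and BAD
df = hmeq.to_frame()                        # client-side copy of the CAS table
miss = df['DEBTINC'].isna().astype(int)     # 1 = missing, 0 = observed
print(mutual_information(miss, df['BAD']))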


In [9]:
conn.dataSciencePilot.analyzeMissingPatterns(
        table = tbl, 
        target = trt, 
        casOut = {'name':'MISS_PATTERN_PY', 'replace':True}
)
conn.fetch(table = {'name': 'MISS_PATTERN_PY'})


Out[9]:
§ Fetch
Selected Rows from Table MISS_PATTERN_PY
FirstVariable SecondVariable Type MI NormMI SU EntropyPerChange
0 CLAGE BAD _mt_ 0.000672 0.036636 0.001324 0.093150
1 CLNO BAD _mt_ 0.000258 0.022695 0.000542 0.035732
2 DEBTINC BAD _mt_ 0.184595 0.555613 0.251610 25.605476
3 DELINQ BAD _mt_ 0.003061 0.078129 0.005183 0.424657
4 DEROG BAD _mt_ 0.003954 0.088750 0.006342 0.548446
5 LOAN BAD _mt_ 0.000000 0.000000 0.000000 0.000000
6 MORTDUE BAD _mt_ 0.000011 0.004749 0.000020 0.001564
7 NINQ BAD _mt_ 0.001243 0.049837 0.002177 0.172475
8 VALUE BAD _mt_ 0.035911 0.263255 0.083951 4.981264
9 YOJ BAD _mt_ 0.002535 0.071110 0.004426 0.351600
10 JOB BAD _mt_ 0.003678 0.085615 0.007404 0.510240
11 REASON BAD _mt_ 0.000016 0.005728 0.000034 0.002276

elapsed 0.00171s · user 0.00165s · mem 0.964MB


Detect Interactions

The detectInteractions action assesses the interactions between pairs of predictor variables and the correlation of each interaction with the response variable. Specifically, it checks whether the product of a pair of predictor variables correlates with the response variable. Because checking the correlation between the product of every predictor pair and the response variable can be computationally intensive, this action relies on the xyz algorithm to search for these interactions efficiently in a high-dimensional space.

The detectInteractions action requires that all predictor variables be in a binary format, but the response variable can be numeric, binary, or multi-class. Additionally, the detectInteractions action can handle data in a sparse format, such as when predictor variables are encoded using a one-hot encoding scheme. In the example below, we specify that our inputs are sparse. The output table shows the gamma value for each pair of variables.
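
To make the idea concrete, here is a naive client-side version of the search that the xyz algorithm accelerates; naive_interactions and its arguments are illustrative names, not part of the action set.

from itertools import combinations
import pandas as pd

def naive_interactions(df, binary_cols, target_col):
    # Score every pairwise product of binary predictors against the response
    scores = {}
    for a, b in combinations(binary_cols, 2):
        product = df[a] * df[b]                  # the interaction term
        scores[(a, b)] = product.corr(df[target_col])
    return scores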


In [10]:
# Transform the data into a binned format for detectInteractions
conn.dataPreprocess.transform(
    table = hmeq, 
    copyVars = ["BAD"], 
    casOut = {"name": "hmeq_transform", "replace": True}, 
    requestPackages = [{"inputs":["JOB", "REASON"], 
                        "catTrans":{"method": "label", "arguments":{"overrides":{"binMissing": True}}}}, 
                      {"inputs":["MORTDUE", "DEBTINC", "LOAN"], 
                       "discretize": {"method": "quantile", "arguments":{"overrides":{"binMissing": True}}} }])
conn.fetch(table = {'name': 'hmeq_transform'})


Out[10]:
§ Fetch
Selected Rows from Table HMEQ_TRANSFORM
BAD _TR2_DEBTINC _TR2_LOAN _TR2_MORTDUE _TR1_JOB _TR1_REASON
0 1.0 0.0 1.0 1.0 3.0 2.0
1 1.0 0.0 1.0 3.0 3.0 2.0
2 1.0 0.0 1.0 1.0 3.0 2.0
3 1.0 0.0 1.0 0.0 0.0 0.0
4 0.0 0.0 1.0 4.0 2.0 2.0
5 1.0 4.0 1.0 1.0 3.0 2.0
6 1.0 0.0 1.0 2.0 3.0 2.0
7 1.0 4.0 1.0 1.0 3.0 2.0
8 1.0 0.0 1.0 1.0 3.0 2.0
9 1.0 0.0 1.0 0.0 5.0 2.0
10 1.0 0.0 1.0 1.0 0.0 0.0
11 1.0 0.0 1.0 1.0 2.0 2.0
12 1.0 0.0 1.0 2.0 3.0 2.0
13 0.0 0.0 1.0 3.0 1.0 0.0
14 1.0 0.0 1.0 3.0 3.0 2.0
15 1.0 0.0 1.0 1.0 3.0 2.0
16 1.0 0.0 1.0 4.0 1.0 2.0
17 1.0 1.0 1.0 1.0 0.0 0.0
18 1.0 0.0 1.0 1.0 3.0 2.0
19 0.0 2.0 1.0 5.0 2.0 2.0

elapsed 0.00171s · user 0.00149s · sys 0.000155s · mem 0.96MB


In [11]:
conn.dataSciencePilot.detectInteractions(
    table ='hmeq_transform', 
    target = trt, 
    event = '1', 
    sparse = True, 
    inputs = ["_TR1_JOB", "_TR1_REASON", "_TR2_MORTDUE", "_TR2_DEBTINC", "_TR2_LOAN"], 
    inputLevels = [7, 3, 6, 6, 6], 
    casOut = {'name': 'DETECT_INT_OUT_PY', 'replace': True})
conn.fetch(table={'name':'DETECT_INT_OUT_PY'})


WARNING: Input values should be integers starting with one when sparseInputs is True.
Out[11]:
§ Fetch
Selected Rows from Table DETECT_INT_OUT_PY
FirstVarID FirstVarName SecondVarID SecondVarName Gamma
0 7.0 _TR1_JOB_7 12.0 _TR2_MORTDUE_2 0.502352
1 10.0 _TR1_REASON_3 12.0 _TR2_MORTDUE_2 0.502352
2 22.0 _TR2_DEBTINC_6 12.0 _TR2_MORTDUE_2 0.502352
3 28.0 _TR2_LOAN_6 12.0 _TR2_MORTDUE_2 0.502352
4 5.0 _TR1_JOB_5 12.0 _TR2_MORTDUE_2 0.481060
5 6.0 _TR1_JOB_6 12.0 _TR2_MORTDUE_2 0.463729
6 7.0 _TR1_JOB_7 14.0 _TR2_MORTDUE_4 0.452340
7 10.0 _TR1_REASON_3 14.0 _TR2_MORTDUE_4 0.452340
8 22.0 _TR2_DEBTINC_6 14.0 _TR2_MORTDUE_4 0.452340
9 28.0 _TR2_LOAN_6 14.0 _TR2_MORTDUE_4 0.452340
10 5.0 _TR1_JOB_5 14.0 _TR2_MORTDUE_4 0.436989
11 6.0 _TR1_JOB_6 14.0 _TR2_MORTDUE_4 0.424610
12 23.0 _TR2_LOAN_1 12.0 _TR2_MORTDUE_2 0.411240
13 6.0 _TR1_JOB_6 8.0 _TR1_REASON_1 0.374103
14 1.0 _TR1_JOB_1 14.0 _TR2_MORTDUE_4 0.371131
15 1.0 _TR1_JOB_1 12.0 _TR2_MORTDUE_2 0.370636
16 23.0 _TR2_LOAN_1 8.0 _TR1_REASON_1 0.359247
17 23.0 _TR2_LOAN_1 14.0 _TR2_MORTDUE_4 0.353305
18 7.0 _TR1_JOB_7 8.0 _TR1_REASON_1 0.352315
19 16.0 _TR2_MORTDUE_6 8.0 _TR1_REASON_1 0.352315

elapsed 0.00192s · user 0.00164s · sys 0.000218s · mem 0.964MB


Screen Variables

The screenVariables action makes one of the following recommendations for each input variable:

  • Remove variable if there are significant data-quality issues.
  • Transform and keep variable if there are some data-quality issues.
  • Keep variable if there are no data-quality issues.

The screenVariables action considers the following features of the input variables to make its recommendation:

  • Missing rate exceeds the threshold in the screenPolicy (default is 90).
  • Constant value across the input variable.
  • Mutual Information (MI) with the target is below the threshold in the screenPolicy (default is 0.05).
  • Entropy across levels.
  • Entropy reduction of the target exceeds the threshold in the screenPolicy (default is 90); also referred to as leakage.
  • Symmetric Uncertainty (SU) of two variables exceeds the threshold in the screenPolicy (default is 1); also referred to as redundancy.

This action returns a CAS table listing all the input variables, the recommended action, and the reason for the recommended action.
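
The thresholds above can be overridden through the screenPolicy parameter. For example, to enforce the stricter 35 percent missing-rate cutoff held in the scpo policy we defined earlier, rather than the defaults used below, the call would look like this (SCREEN_STRICT_PY is a hypothetical output table name):

conn.dataSciencePilot.screenVariables(
    table = tbl,
    target = trt,
    casOut = {'name': 'SCREEN_STRICT_PY', 'replace': True},
    screenPolicy = scpo
)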


In [12]:
conn.dataSciencePilot.screenVariables(
    table = tbl, 
    target = trt, 
    casOut = {'name': 'SCREEN_VARIABLES_OUT_PY', 'replace': True}, 
    screenPolicy = {}
)
conn.fetch(table = {'name': 'SCREEN_VARIABLES_OUT_PY'})


Out[12]:
§ Fetch
Selected Rows from Table SCREEN_VARIABLES_OUT_PY
Variable Recommendation Reason
0 REASON keep passed all screening tests
1 JOB keep passed all screening tests
2 LOAN keep passed all screening tests
3 MORTDUE keep passed all screening tests
4 VALUE keep passed all screening tests
5 YOJ keep passed all screening tests
6 DEROG keep passed all screening tests
7 DELINQ keep passed all screening tests
8 CLAGE keep passed all screening tests
9 NINQ keep passed all screening tests
10 CLNO keep passed all screening tests
11 DEBTINC keep passed all screening tests

elapsed 0.00164s · user 0.00159s · mem 0.966MB


Feature Machine

The featureMachine action automates the parallel generation of features. It first explores the data and groups the input variables into categories with the same statistical profile, like the exploreData action. Next, it screens the variables to identify noise variables to exclude from further analysis, like the screenVariables action. Finally, it generates new features by using the available structured pipelines:

  • Missing indicator addition.
  • Mode imputation and rare value grouping.
  • Missing level and rare value grouping.
  • Median imputation.
  • Mode imputation and label encoding.
  • Missing level and label encoding.
  • Yeo-Johnson transformation and median imputation.
  • Box-Cox transformation.
  • Quantile binning with missing bins.
  • Regression tree binning.
  • Decision tree binning.
  • MDLP binning.
  • Target encoding.
  • Date, time, and datetime transformations.

Depending on the parameters specified in the transformationPolicy, the featureMachine action can generate several features for each input variable. This action returns four CAS tables: the first lists information about the transformation pipelines, the second lists information about the transformed features, the third is the input table scored with the transformed features, and the fourth is an analytic store for scoring any additional input tables.
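
Once the action below has run, the saved analytic store can be used to score new tables. Here is a sketch, assuming the aStore action set's score action and a hypothetical new table named HMEQ_NEW:

conn.builtins.loadactionset('astore')
conn.astore.score(
    table  = {'name': 'HMEQ_NEW'},              # hypothetical new input table
    rstore = {'name': 'ASTORE_OUT'},            # analytic store saved below
    out    = {'name': 'HMEQ_NEW_FEATURES', 'replace': True}
)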


In [13]:
conn.dataSciencePilot.featureMachine(
    table = tbl, 
    target = trt, 
    copyVars = trt, 
    explorationPolicy = expo, 
    screenPolicy = scpo, 
    transformationPolicy = trpo, 
    transformationOut       = {"name" : "TRANSFORMATION_OUT", "replace" : True},
    featureOut              = {"name" : "FEATURE_OUT", "replace" : True},
    casOut                  = {"name" : "CAS_OUT", "replace" : True},
    saveState               = {"name" : "ASTORE_OUT", "replace" : True}  
)


Out[13]:
§ OutputCasTables
casLib Name Rows Columns casTable
0 CASUSER(sasdemo) TRANSFORMATION_OUT 33 21 CASTable('TRANSFORMATION_OUT', caslib='CASUSER...
1 CASUSER(sasdemo) FEATURE_OUT 59 9 CASTable('FEATURE_OUT', caslib='CASUSER(sasdem...
2 CASUSER(sasdemo) CAS_OUT 5960 60 CASTable('CAS_OUT', caslib='CASUSER(sasdemo)')
3 CASUSER(sasdemo) ASTORE_OUT 1 2 CASTable('ASTORE_OUT', caslib='CASUSER(sasdemo)')

elapsed 0.305s · user 0.524s · sys 0.119s · mem 44.7MB


In [14]:
conn.fetch(table = {'name': 'TRANSFORMATION_OUT'})


Out[14]:
§ Fetch
Selected Rows from Table TRANSFORMATION_OUT
FTGPipelineId Name NVariables IsInteraction ImputeMethod OutlierMethod OutlierTreat OutlierArgs FunctionMethod FunctionArgs ... MapIntervalArgs HashMethod HashArgs DateTimeMethod DiscretizeMethod DiscretizeArgs CatTransMethod CatTransArgs InteractionMethod InteractionSynthesizer
0 1.0 miss_ind 5.0 NaN ... NaN MissIndicator 2.0 NaN NaN
1 2.0 grp_rare1 2.0 Mode NaN ... NaN NaN NaN Group Rare 5.0
2 3.0 hc_tar_frq_rat 1.0 NaN ... 10.0 NaN NaN NaN
3 4.0 hc_lbl_cnt 1.0 NaN ... 0.0 NaN NaN NaN
4 5.0 hc_cnt 1.0 NaN ... 0.0 NaN NaN NaN
5 6.0 hc_cnt_log 1.0 NaN Log e ... 0.0 NaN NaN NaN
6 7.0 lchehi_lab 1.0 NaN ... NaN NaN NaN Label (Sparse One-Hot) 0.0
7 8.0 lcnhenhi_grp_rare 1.0 NaN ... NaN NaN NaN Group Rare 5.0
8 9.0 lcnhenhi_dtree5 1.0 NaN ... NaN NaN NaN DTree 5.0
9 10.0 lcnhenhi_dtree10 1.0 NaN ... NaN NaN NaN DTree 10.0
10 11.0 ho_winsor 2.0 Median Modified IQR Winsor 0.0 ... NaN NaN NaN NaN
11 12.0 ho_quan_disct5 2.0 Modified IQR Trim 0.0 ... NaN NaN Equal-Freq (Quantile) 5.0 NaN
12 13.0 ho_quan_disct10 2.0 Modified IQR Trim 0.0 ... NaN NaN Equal-Freq (Quantile) 10.0 NaN
13 14.0 ho_dtree_disct5 2.0 NaN ... NaN NaN DTree 5.0 NaN
14 15.0 ho_dtree_disct10 2.0 NaN ... NaN NaN DTree 10.0 NaN
15 16.0 hk_yj_n2 1.0 Median NaN Yeo-Johnson -2 ... NaN NaN NaN NaN
16 17.0 hk_yj_n1 1.0 Median NaN Yeo-Johnson -1 ... NaN NaN NaN NaN
17 18.0 hk_yj_0 1.0 Median NaN Yeo-Johnson 0 ... NaN NaN NaN NaN
18 19.0 hk_yj_p1 1.0 Median NaN Yeo-Johnson 1 ... NaN NaN NaN NaN
19 20.0 hk_yj_p2 1.0 Median NaN Yeo-Johnson 2 ... NaN NaN NaN NaN

20 rows × 21 columns

elapsed 0.00261s · user 0.00255s · mem 1MB


In [15]:
conn.fetch(table = {'name': 'FEATURE_OUT'})


Out[15]:
§ Fetch
Selected Rows from Table FEATURE_OUT
FeatureId Name IsNominal FTGPipelineId NInputs InputVar1 InputVar2 InputVar3 Label
0 1.0 cpy_int_med_imp_CLAGE 0.0 32.0 1.0 CLAGE CLAGE: Low missing rate - median imputation
1 2.0 miss_ind_CLAGE 1.0 1.0 1.0 CLAGE CLAGE: Significant missing - missing indicator
2 3.0 nhoks_nloks_dtree_10_CLAGE 1.0 31.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
3 4.0 nhoks_nloks_dtree_5_CLAGE 1.0 30.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
4 5.0 nhoks_nloks_log_CLAGE 0.0 26.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
5 6.0 nhoks_nloks_pow_n0_5_CLAGE 0.0 25.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
6 7.0 nhoks_nloks_pow_n1_CLAGE 0.0 24.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
7 8.0 nhoks_nloks_pow_n2_CLAGE 0.0 23.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
8 9.0 nhoks_nloks_pow_p0_5_CLAGE 0.0 27.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
9 10.0 nhoks_nloks_pow_p1_CLAGE 0.0 28.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
10 11.0 nhoks_nloks_pow_p2_CLAGE 0.0 29.0 1.0 CLAGE CLAGE: Not high (outlier, kurtosis, skewness) ...
11 12.0 cpy_int_med_imp_DEBTINC 0.0 32.0 1.0 DEBTINC DEBTINC: Low missing rate - median imputation
12 13.0 hk_dtree_disct10_DEBTINC 1.0 22.0 1.0 DEBTINC DEBTINC: High kurtosis - ten bin decision tree...
13 14.0 hk_dtree_disct5_DEBTINC 1.0 21.0 1.0 DEBTINC DEBTINC: High kurtosis - five bin decision tre...
14 15.0 hk_yj_0_DEBTINC 0.0 18.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=0)...
15 16.0 hk_yj_n1_DEBTINC 0.0 17.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=-1...
16 17.0 hk_yj_n2_DEBTINC 0.0 16.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=-2...
17 18.0 hk_yj_p1_DEBTINC 0.0 19.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=1)...
18 19.0 hk_yj_p2_DEBTINC 0.0 20.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=2)...
19 20.0 miss_ind_DEBTINC 1.0 1.0 1.0 DEBTINC DEBTINC: Significant missing - missing indicator

elapsed 0.00187s · user 0.000864s · sys 0.000959s · mem 0.965MB


In [16]:
conn.fetch(table = {'name': 'CAS_OUT'})


Out[16]:
§ Fetch
Selected Rows from Table CAS_OUT
BAD cpy_int_med_imp_CLAGE miss_ind_CLAGE nhoks_nloks_dtree_10_CLAGE nhoks_nloks_dtree_5_CLAGE nhoks_nloks_log_CLAGE nhoks_nloks_pow_n0_5_CLAGE nhoks_nloks_pow_n1_CLAGE nhoks_nloks_pow_n2_CLAGE nhoks_nloks_pow_p0_5_CLAGE ... hc_lbl_cnt_LOAN hc_tar_frq_rat_LOAN cpy_nom_miss_lev_lab_NINQ lcnhenhi_dtree10_NINQ lcnhenhi_dtree5_NINQ lcnhenhi_grp_rare_NINQ miss_ind_NINQ cpy_nom_miss_lev_lab_JOB lchehi_lab_JOB cpy_nom_miss_lev_lab_REASON
0 1.0 94.366667 1.0 3.0 2.0 4.557729 0.102400 0.010486 0.000110 9.765586 ... 528.0 0.5 2.0 2.0 2.0 2.0 1.0 3.0 3.0 2.0
1 1.0 121.833333 1.0 4.0 2.0 4.810828 0.090228 0.008141 0.000066 11.083020 ... 461.0 0.5 1.0 1.0 1.0 1.0 1.0 3.0 3.0 2.0
2 1.0 149.466667 1.0 4.0 2.0 5.013742 0.081523 0.006646 0.000044 12.266486 ... 385.0 0.5 2.0 2.0 2.0 2.0 1.0 3.0 3.0 2.0
3 1.0 173.466667 0.0 0.0 0.0 5.161734 0.075708 0.005732 0.000033 13.208583 ... 385.0 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 93.333333 1.0 2.0 2.0 4.546835 0.102960 0.010601 0.000112 9.712535 ... 359.0 0.5 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0
5 1.0 101.466002 1.0 3.0 2.0 4.629531 0.098789 0.009759 0.000095 10.122549 ... 359.0 0.5 2.0 2.0 2.0 2.0 1.0 3.0 3.0 2.0
6 1.0 77.100000 1.0 2.0 2.0 4.357990 0.113155 0.012804 0.000164 8.837420 ... 401.0 0.5 2.0 2.0 2.0 2.0 1.0 3.0 3.0 2.0
7 1.0 88.766030 1.0 2.0 2.0 4.497207 0.105547 0.011140 0.000124 9.474494 ... 401.0 0.5 1.0 1.0 1.0 1.0 1.0 3.0 3.0 2.0
8 1.0 216.933333 1.0 7.0 4.0 5.384189 0.067739 0.004589 0.000021 14.762565 ... 259.0 0.5 2.0 2.0 2.0 2.0 1.0 3.0 3.0 2.0
9 1.0 115.800000 1.0 3.0 2.0 4.760463 0.092529 0.008562 0.000073 10.807405 ... 259.0 0.5 1.0 1.0 1.0 1.0 1.0 5.0 5.0 2.0
10 1.0 173.466667 0.0 0.0 0.0 5.161734 0.075708 0.005732 0.000033 13.208583 ... 259.0 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
11 1.0 122.533333 1.0 4.0 2.0 4.816511 0.089972 0.008095 0.000066 11.114555 ... 259.0 0.5 2.0 2.0 2.0 2.0 1.0 2.0 2.0 2.0
12 1.0 86.066667 1.0 2.0 2.0 4.466674 0.107170 0.011485 0.000132 9.330952 ... 259.0 0.5 3.0 3.0 3.0 3.0 1.0 3.0 3.0 2.0
13 0.0 147.133333 1.0 4.0 2.0 4.998113 0.082162 0.006751 0.000046 12.171004 ... 259.0 0.5 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.0
14 1.0 123.000000 1.0 4.0 2.0 4.820282 0.089803 0.008065 0.000065 11.135529 ... 447.0 0.5 1.0 1.0 1.0 1.0 1.0 3.0 3.0 2.0
15 1.0 300.866667 1.0 9.0 5.0 5.709985 0.057556 0.003313 0.000011 17.374311 ... 339.0 0.5 1.0 1.0 1.0 1.0 1.0 3.0 3.0 2.0
16 1.0 122.900000 1.0 4.0 2.0 4.819475 0.089839 0.008071 0.000065 11.131038 ... 339.0 0.5 2.0 2.0 2.0 2.0 1.0 1.0 1.0 2.0
17 1.0 173.466667 0.0 0.0 0.0 5.161734 0.075708 0.005732 0.000033 13.208583 ... 339.0 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
18 1.0 54.600000 1.0 1.0 1.0 4.018183 0.134110 0.017986 0.000323 7.456541 ... 353.0 0.5 2.0 2.0 2.0 2.0 1.0 3.0 3.0 2.0
19 0.0 90.992533 1.0 2.0 2.0 4.521707 0.104261 0.010870 0.000118 9.591274 ... 353.0 0.5 1.0 1.0 1.0 1.0 1.0 2.0 2.0 2.0

20 rows × 60 columns

elapsed 0.0028s · user 0.00274s · mem 1.03MB


Generate Shadow Features

The generateShadowFeatures action performs a scalable random permutation of the input features to create shadow features. Each shadow feature is randomly drawn from a distribution matching that of its input feature. These shadow features can be used for all-relevant feature selection, which removes any input whose variable importance is lower than its shadow feature’s variable importance. The shadow features can also be used in a post-fit analysis with Permutation Feature Importance (PFI): by replacing each input with its shadow feature one by one and measuring the change in model performance, one can determine that feature’s importance from the relative size of the change.

In the example below, I will use the outputs of the feature machine for all-relevant feature selection. This involves getting the variable metadata from my feature machine table, generating my shadow features, finding the variable importance of my features and shadow features by using a random forest, and comparing each feature’s importance to that of its shadow features. In the end, I will keep only the variables whose importance is higher than their shadow features’ importance for the next phase.
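
The cells below implement the all-relevant route. As a conceptual sketch of the PFI route, assuming a client-side DataFrame and a scoring function score(df) that returns model performance (both hypothetical), one could write:

def pfi_with_shadows(df, score, pairs):
    # pairs maps each input column to its shadow column
    base = score(df)
    importances = {}
    for col, shadow in pairs.items():
        swapped = df.copy()
        swapped[col] = df[shadow].values           # replace the input with its shadow
        importances[col] = base - score(swapped)   # performance drop = importance
    return importances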


In [17]:
# Getting variable names and metadata from feature machine output
fm = conn.CASTable('FEATURE_OUT').to_frame()
inputs = fm['Name'].to_list()
nom = fm.loc[fm['IsNominal'] == 1]
nom = nom['Name'].to_list()

# Generating Shadow Features
conn.dataSciencePilot.generateShadowFeatures(
    table = 'CAS_OUT', 
    nProbes = 2, 
    inputs = inputs, 
    nominals = nom,
    casout={"name" : "SHADOW_FEATURES_OUT", "replace" : True},
    copyVars = trt
)
conn.fetch(table = {"name" : "SHADOW_FEATURES_OUT"})


Out[17]:
§ Fetch
Selected Rows from Table SHADOW_FEATURES_OUT
BAD _fpi_cpy_int_med_imp_CLAGE_1 _fpi_cpy_int_med_imp_CLAGE_2 _fpi_cpy_int_med_imp_DEBTINC_1 _fpi_cpy_int_med_imp_DEBTINC_2 _fpi_cpy_int_med_imp_MORTDUE_1 _fpi_cpy_int_med_imp_MORTDUE_2 _fpi_cpy_int_med_imp_VALUE_1 _fpi_cpy_int_med_imp_VALUE_2 _fpi_cpy_int_med_imp_YOJ_1 ... _fpn_miss_ind_YOJ_1 _fpn_miss_ind_YOJ_2 _fpn_nhoks_nloks_dtree_10_CLAGE_1 _fpn_nhoks_nloks_dtree_10_CLAGE_2 _fpn_nhoks_nloks_dtree_10_YOJ_1 _fpn_nhoks_nloks_dtree_10_YOJ_2 _fpn_nhoks_nloks_dtree_5_CLAGE_1 _fpn_nhoks_nloks_dtree_5_CLAGE_2 _fpn_nhoks_nloks_dtree_5_YOJ_1 _fpn_nhoks_nloks_dtree_5_YOJ_2
0 1.0 212.857966 306.301194 41.920768 37.885292 126964.545638 143702.041714 115978.674484 67369.485639 16.501445 ... 1.0 1.0 5.0 5.0 8.0 4.0 5.0 3.0 2.0 4.0
1 1.0 107.271223 173.466669 47.838093 43.263531 98964.859966 71690.484398 140962.807997 43080.986769 0.028263 ... 0.0 1.0 4.0 3.0 6.0 2.0 1.0 4.0 2.0 4.0
2 1.0 184.889813 212.829761 38.964341 36.458693 31115.897464 65019.225046 90000.081104 83919.782979 23.570016 ... 1.0 1.0 4.0 6.0 4.0 8.0 2.0 4.0 4.0 4.0
3 1.0 622.587866 107.172996 34.818262 36.463339 62360.100423 94601.260163 49543.401705 31888.847989 26.744998 ... 1.0 1.0 7.0 9.0 9.0 9.0 4.0 2.0 4.0 1.0
4 0.0 121.889601 218.133473 28.422306 41.635977 48107.013889 75397.282159 31532.854191 65026.000575 5.013121 ... 1.0 1.0 8.0 5.0 9.0 3.0 2.0 1.0 4.0 4.0
5 1.0 181.232278 112.785876 34.769221 36.994494 65020.344536 47127.398889 288193.525528 115523.037663 0.097692 ... 1.0 1.0 6.0 7.0 9.0 10.0 5.0 4.0 0.0 2.0
6 1.0 208.091265 81.477013 31.280852 28.122599 20635.084053 54337.933251 46838.335040 35974.002994 1.063992 ... 1.0 1.0 0.0 9.0 8.0 8.0 4.0 2.0 2.0 5.0
7 1.0 202.794672 261.165400 34.818263 26.903285 57406.281905 41256.638428 195909.103089 86605.889140 5.015190 ... 1.0 1.0 1.0 8.0 7.0 9.0 5.0 2.0 4.0 4.0
8 1.0 108.271540 130.776027 27.694422 31.058132 140511.373088 52062.739878 94710.729253 39620.951869 3.005884 ... 1.0 1.0 9.0 9.0 7.0 9.0 4.0 3.0 1.0 5.0
9 1.0 114.463463 367.218162 34.818270 38.265665 159481.832637 88873.732650 26510.127331 182172.822104 3.020234 ... 1.0 1.0 4.0 8.0 4.0 10.0 5.0 2.0 2.0 1.0
10 1.0 95.409991 142.059951 39.944378 40.958111 112189.134571 30315.160184 128831.408335 94700.092549 10.014143 ... 1.0 1.0 5.0 6.0 9.0 10.0 2.0 4.0 5.0 4.0
11 1.0 89.509339 298.772479 25.195575 23.596494 64505.016457 5918.092720 82232.870371 87001.388840 12.049086 ... 1.0 1.0 6.0 7.0 10.0 6.0 2.0 2.0 0.0 4.0
12 1.0 115.560156 312.944545 26.493703 30.676435 191992.022971 113012.540859 72169.216738 99158.332774 10.159464 ... 1.0 1.0 6.0 9.0 4.0 9.0 2.0 4.0 2.0 1.0
13 0.0 625.880332 368.142485 39.872955 40.591487 84981.145822 62655.117888 166417.021621 697244.063305 9.016728 ... 1.0 1.0 6.0 5.0 8.0 1.0 4.0 0.0 2.0 5.0
14 1.0 132.269452 266.852700 40.620141 31.823471 90270.357142 100484.635156 89779.631224 99572.811032 10.084133 ... 1.0 1.0 9.0 4.0 9.0 9.0 3.0 2.0 4.0 5.0
15 1.0 126.489363 161.355046 35.841682 26.158397 61979.841665 65019.079157 82293.802030 76526.707935 0.069934 ... 1.0 1.0 5.0 9.0 5.0 8.0 2.0 4.0 4.0 5.0
16 1.0 177.252156 121.154932 34.818275 20.582901 138772.676671 85141.457831 57041.144059 78680.002903 1.224489 ... 0.0 1.0 3.0 2.0 9.0 4.0 5.0 2.0 2.0 1.0
17 1.0 222.075743 230.390596 34.818274 41.524796 5828.456153 31805.560938 103250.638867 52594.394327 16.477974 ... 1.0 1.0 2.0 8.0 8.0 9.0 2.0 4.0 2.0 4.0
18 1.0 222.783684 185.874565 35.867232 22.032422 65020.251227 56977.475725 158644.708235 65340.716333 12.016391 ... 1.0 1.0 9.0 6.0 2.0 9.0 2.0 4.0 4.0 5.0
19 0.0 203.538315 118.651494 42.397829 39.043616 67283.352046 99171.454396 48825.120178 101228.427251 14.015437 ... 1.0 1.0 4.0 0.0 0.0 2.0 4.0 2.0 4.0 2.0

20 rows × 119 columns

elapsed 0.00366s · user 0.00268s · sys 0.000923s · mem 1.16MB


In [18]:
# Getting Feature Importance for Original Features
feats = conn.decisionTree.forestTrain(
    table = 'CAS_OUT', 
    inputs = inputs, 
    target = trt, 
    varImp = True)
real_features = feats.DTreeVarImpInfo

# Getting Feature Importance for Shadow Features
# Getting the shadow-feature column names, excluding the target
inp = [c for c in conn.CASTable('SHADOW_FEATURES_OUT').columns.to_list() if c != trt]
shadow_feats = conn.decisionTree.forestTrain(
    table = 'SHADOW_FEATURES_OUT', 
    inputs = inp, 
    target = trt, 
    varImp = True)
sf = shadow_feats.DTreeVarImpInfo

# Building dataframe for easy comparison 
feat_comp = pd.DataFrame(columns=['Variable', 'Real_Imp', 'SF_Imp1', 'SF_Imp2'])
# Filling Variable Column of Data Frame from Feature
feat_comp['Variable'] = real_features['Variable']
# Filling Importance Column of Data Frame from Feature
feat_comp['Real_Imp'] = real_features['Importance']
# Finding each Feature's Shadow Feature
for index, row in sf.iterrows():
    temp_name = row['Variable']
    temp_num = int(temp_name[-1:])
    temp_name = temp_name[5:-2]
    temp_imp = row['Importance']
    for ind, ro in feat_comp.iterrows():
        if temp_name == ro['Variable']:
            if temp_num == 1:
                # Filling First Shadow Feature's Importance
                feat_comp.at[ind, 'SF_Imp1'] = temp_imp
            else:
                # Filling Second Shadow Feature's Importance
                feat_comp.at[ind, 'SF_Imp2'] = temp_imp
feat_comp.head()


Out[18]:
Variable Real_Imp SF_Imp1 SF_Imp2
0 hk_dtree_disct10_DEBTINC 50.625820 0.356029 0.42216
1 hk_dtree_disct5_DEBTINC 44.679171 0.0912159 0.153254
2 miss_ind_DEBTINC 29.850201 0.0304762 NaN
3 cpy_int_med_imp_DEBTINC 24.158801 0.595817 0.525752
4 grp_rare1_DELINQ 17.844217 0.0468703 0.0970889

In [19]:
# Determining which features have an importance smaller than their shadow feature's importance
to_drop = list()
for ind, ro in feat_comp.iterrows():
    if ro['Real_Imp'] <= ro['SF_Imp1'] or ro['Real_Imp'] <= ro['SF_Imp2']:
        to_drop.append(ro['Variable'])
to_drop


Out[19]:
['ho_winsor_VALUE',
 'ho_winsor_MORTDUE',
 'nhoks_nloks_pow_n1_YOJ',
 'nhoks_nloks_dtree_10_YOJ',
 'hc_cnt_LOAN',
 'nhoks_nloks_pow_n2_YOJ',
 'nhoks_nloks_pow_p0_5_YOJ',
 'nhoks_nloks_pow_p2_YOJ',
 'hc_cnt_log_LOAN',
 'nhoks_nloks_pow_p1_YOJ',
 'ho_quan_disct10_MORTDUE',
 'ho_dtree_disct10_MORTDUE',
 'miss_ind_CLAGE',
 'ho_dtree_disct5_MORTDUE',
 'miss_ind_NINQ']

In [20]:
# Dropping Columns from CAS_OUT
CAS_OUT=conn.CASTable('CAS_OUT')
CAS_OUT = CAS_OUT.drop(to_drop, axis=1)

Select Features

The selectFeatures action performs a filter-based selection by the criterion specified in the selectionPolicy (the default is the best ten input variables according to the Mutual Information statistic). The criteria available for selection include Chi-Square, Cramer’s V, F-test, G2, Information Value, Mutual Information, the Normalized Mutual Information statistic, Pearson correlation, and the Symmetric Uncertainty statistic. This action returns a CAS table listing the variables, their rank according to the selected criterion, and the value of that criterion. Below, we first re-screen the engineered features in CAS_OUT with the screenVariables action; a selectFeatures call is sketched after the screening output.


In [21]:
conn.dataSciencePilot.screenVariables(
    table='CAS_OUT', 
    target=trt, 
    screenPolicy=scpo, 
    casout={"name" : "SCREEN_VARIABLES_OUT", "replace" : True}
)
conn.fetch(table = {"name" : "SCREEN_VARIABLES_OUT"})


Out[21]:
§ Fetch
Selected Rows from Table SCREEN_VARIABLES_OUT
Variable Recommendation Reason
0 cpy_int_med_imp_CLAGE keep passed all screening tests
1 miss_ind_CLAGE keep passed all screening tests
2 nhoks_nloks_dtree_10_CLAGE keep passed all screening tests
3 nhoks_nloks_dtree_5_CLAGE keep passed all screening tests
4 nhoks_nloks_log_CLAGE keep passed all screening tests
5 nhoks_nloks_pow_n0_5_CLAGE keep passed all screening tests
6 nhoks_nloks_pow_n1_CLAGE keep passed all screening tests
7 nhoks_nloks_pow_n2_CLAGE keep passed all screening tests
8 nhoks_nloks_pow_p0_5_CLAGE keep passed all screening tests
9 nhoks_nloks_pow_p1_CLAGE keep passed all screening tests
10 nhoks_nloks_pow_p2_CLAGE keep passed all screening tests
11 cpy_int_med_imp_DEBTINC keep passed all screening tests
12 hk_dtree_disct10_DEBTINC keep passed all screening tests
13 hk_dtree_disct5_DEBTINC keep passed all screening tests
14 hk_yj_0_DEBTINC keep passed all screening tests
15 hk_yj_n1_DEBTINC keep passed all screening tests
16 hk_yj_n2_DEBTINC keep passed all screening tests
17 hk_yj_p1_DEBTINC keep passed all screening tests
18 hk_yj_p2_DEBTINC keep passed all screening tests
19 miss_ind_DEBTINC keep passed all screening tests

elapsed 0.00187s · user 0.00172s · mem 0.965MB
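
With the engineered features screened, a selectFeatures call using the selection policy sepo defined earlier might look like the following sketch (SELECT_FEATURES_OUT_PY is a hypothetical output table name, and the output is not shown here):

conn.dataSciencePilot.selectFeatures(
    table = 'CAS_OUT',
    target = trt,
    selectionPolicy = sepo,
    casOut = {'name': 'SELECT_FEATURES_OUT_PY', 'replace': True}
)
conn.fetch(table = {'name': 'SELECT_FEATURES_OUT_PY'})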


Data Science Automated Machine Learning Pipeline

The dsAutoMl action creates a policy-based, scalable, end-to-end automated machine learning pipeline for both regression and classification problems. The only inputs required from the user are the input data set and the target variable; optional parameters include the policy parameters for data exploration, variable screening, feature selection, and feature transformation. Overriding the default policy parameters allows a data scientist to configure the pipeline for their data science workflow. In addition, a data scientist may also select additional models to consider. By default, only a decision tree model is included in the pipeline, but neural networks, random forests, and gradient boosting models are also available.

The dsAutoMl action first explores the data and groups the input variables into categories with the same statistical profile, like the exploreData action. Next, it screens variables to identify noise variables to exclude from further analysis, like the screenVariables action. Then it generates several new features for the input variables, like the featureMachine action. Once the new, cleaned features are available, the dsAutoMl action selects features based on the chosen criterion, like the selectFeatures action.

From here, various pipelines are created using subsets of the selected features, chosen for each pipeline using a feature-representation algorithm. The chosen models are then added to each pipeline, and the hyperparameters for the selected models are optimized, like the modelComposer action of the Autotune action set. The hyperparameters are optimized for the selected objective under cross-validation. By default, classification problems are optimized for the smallest Misclassification Error Rate (MCE) and regression problems are optimized for the smallest Average Square Error (ASE). Data scientists can then select their champion and challenger models from the pipelines.

This action returns several CAS tables: the first lists information about the transformation pipelines, the second lists information about the transformed features, the third lists pipeline performance according to the objective parameter, and the last tables are analytic stores for creating the feature set and scoring with our model when new data is available.
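
Once the action below completes, the pipeline table can be fetched to compare the top pipelines by the objective (a usage sketch; its output is not shown):

conn.fetch(table = {'name': 'PIPELINE_OUT_PY'})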


In [22]:
conn.dataSciencePilot.dsAutoMl(
    table                = tbl,
    target               = trt,
    explorationPolicy    = expo,
    screenPolicy         = scpo,
    selectionPolicy      = sepo,
    transformationPolicy = trpo,
    modelTypes           = ["decisionTree", "gradboost"],
    objective            = "ASE",
    sampleSize           = 10,
    topKPipelines        = 10,
    kFolds               = 5,
    transformationOut    = {"name" : "TRANSFORMATION_OUT_PY", "replace" : True},
    featureOut           = {"name" : "FEATURE_OUT_PY", "replace" : True},
    pipelineOut          = {"name" : "PIPELINE_OUT_PY", "replace" : True},
    saveState            = {"modelNamePrefix" : "ASTORE_OUT_PY", "replace" : True, "topK" : 1}
)


NOTE: Added action set 'autotune'.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Added action set 'decisionTree'.
WARNING: The VALUELIST for the tuning parameter 'M' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'RIDGE' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Early stopping is activated; 'NTREE' will not be tuned.
NOTE: Added action set 'autotune'.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: The number of bins will not be tuned since all inputs are nominal.
NOTE: Added action set 'decisionTree'.
WARNING: The VALUELIST for the tuning parameter 'M' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'RIDGE' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Early stopping is activated; 'NTREE' will not be tuned.
NOTE: The number of bins will not be tuned since all inputs are nominal.
NOTE: Added action set 'autotune'.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: The number of bins will not be tuned since all inputs are nominal.
NOTE: Added action set 'decisionTree'.
WARNING: The VALUELIST for the tuning parameter 'M' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'RIDGE' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Early stopping is activated; 'NTREE' will not be tuned.
NOTE: The number of bins will not be tuned since all inputs are nominal.
NOTE: Added action set 'autotune'.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: The number of bins will not be tuned since all inputs are nominal.
NOTE: Added action set 'decisionTree'.
WARNING: The VALUELIST for the tuning parameter 'M' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'RIDGE' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Early stopping is activated; 'NTREE' will not be tuned.
NOTE: The number of bins will not be tuned since all inputs are nominal.
NOTE: Added action set 'autotune'.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Added action set 'decisionTree'.
WARNING: The VALUELIST for the tuning parameter 'M' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'RIDGE' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Early stopping is activated; 'NTREE' will not be tuned.
NOTE: Added action set 'autotune'.
WARNING: The VALUELIST for the tuning parameter 'M' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'RIDGE' contains only one element.
WARNING: The VALUELIST for the tuning parameter 'NBINS' contains only one element.
NOTE: Early stopping is activated; 'NTREE' will not be tuned.
NOTE: Added action set 'decisionTree'.
NOTE: 5516602 bytes were written to the table "ASTORE_OUT_PY_gradBoost_1" in the caslib "CASUSER(sasdemo)".
Out[22]:
§ ModelInfo_1_DecisionTree
Decision Tree for __TEMP_FEATURE_MACHINE_CASOUT___AUTOTUNE_11FEB2020:09:55:51
Descr Value
0 Number of Tree Nodes 599.00000
1 Max Number of Branches 2.00000
2 Number of Levels 15.00000
3 Number of Leaves 300.00000
4 Number of Bins 100.00000
5 Minimum Size of Leaves 5.00000
6 Maximum Size of Leaves 442.00000
7 Number of Variables 4.00000
8 Confidence Level for Pruning 0.25000
9 Number of Observations Used 5960.00000
10 Misclassification Error (%) 11.47651

§ ScoreInfo_1_DecisionTree
Descr Value
0 Number of Observations Read 5960
1 Number of Observations Used 5960
2 Misclassification Error (%) 11.476510067

§ EncodedName_1_DecisionTree
LEVNAME LEVINDEX VARNAME
0 1 0 P_BAD1
1 0 1 P_BAD0

§ EncodedTargetName_1_DecisionTree
LEVNAME LEVINDEX VARNAME
0 0 I_BAD

§ ROCInfo_1_DecisionTree
ROC Information for _AUTOTUNE_DEFAULT_SCORE_TABLE_
Variable Event CutOff TP FP FN TN Sensitivity Specificity KS ... F_HALF FPR ACC FDR F1 C Gini Gamma Tau MISCEVENT
0 P_BAD0 0 0.00 4771.0 1189.0 0.0 0.0 1.000000 0.000000 0.0 ... 0.833770 1.000000 0.800503 0.199497 0.889200 0.922537 0.845073 0.854121 0.269958 0.199497
1 P_BAD0 0 0.01 4771.0 1015.0 0.0 174.0 1.000000 0.146341 0.0 ... 0.854558 0.853659 0.829698 0.175423 0.903855 0.922537 0.845073 0.854121 0.269958 0.170302
2 P_BAD0 0 0.02 4771.0 1015.0 0.0 174.0 1.000000 0.146341 0.0 ... 0.854558 0.853659 0.829698 0.175423 0.903855 0.922537 0.845073 0.854121 0.269958 0.170302
3 P_BAD0 0 0.03 4771.0 1015.0 0.0 174.0 1.000000 0.146341 0.0 ... 0.854558 0.853659 0.829698 0.175423 0.903855 0.922537 0.845073 0.854121 0.269958 0.170302
4 P_BAD0 0 0.04 4770.0 989.0 1.0 200.0 0.999790 0.168209 0.0 ... 0.857698 0.831791 0.833893 0.171731 0.905983 0.922537 0.845073 0.854121 0.269958 0.166107
5 P_BAD0 0 0.05 4770.0 989.0 1.0 200.0 0.999790 0.168209 0.0 ... 0.857698 0.831791 0.833893 0.171731 0.905983 0.922537 0.845073 0.854121 0.269958 0.166107
6 P_BAD0 0 0.06 4770.0 989.0 1.0 200.0 0.999790 0.168209 0.0 ... 0.857698 0.831791 0.833893 0.171731 0.905983 0.922537 0.845073 0.854121 0.269958 0.166107
7 P_BAD0 0 0.07 4769.0 974.0 2.0 215.0 0.999581 0.180824 0.0 ... 0.859496 0.819176 0.836242 0.169598 0.907171 0.922537 0.845073 0.854121 0.269958 0.163758
8 P_BAD0 0 0.08 4767.0 949.0 4.0 240.0 0.999162 0.201850 0.0 ... 0.862493 0.798150 0.840101 0.166025 0.909126 0.922537 0.845073 0.854121 0.269958 0.159899
9 P_BAD0 0 0.09 4767.0 949.0 4.0 240.0 0.999162 0.201850 0.0 ... 0.862493 0.798150 0.840101 0.166025 0.909126 0.922537 0.845073 0.854121 0.269958 0.159899
10 P_BAD0 0 0.10 4766.0 939.0 5.0 250.0 0.998952 0.210261 0.0 ... 0.863687 0.789739 0.841611 0.164592 0.909889 0.922537 0.845073 0.854121 0.269958 0.158389
11 P_BAD0 0 0.11 4766.0 939.0 5.0 250.0 0.998952 0.210261 0.0 ... 0.863687 0.789739 0.841611 0.164592 0.909889 0.922537 0.845073 0.854121 0.269958 0.158389
12 P_BAD0 0 0.12 4765.0 931.0 6.0 258.0 0.998742 0.216989 0.0 ... 0.864634 0.783011 0.842785 0.163448 0.910481 0.922537 0.845073 0.854121 0.269958 0.157215
13 P_BAD0 0 0.13 4764.0 924.0 7.0 265.0 0.998533 0.222876 0.0 ... 0.865458 0.777124 0.843792 0.162447 0.910986 0.922537 0.845073 0.854121 0.269958 0.156208
14 P_BAD0 0 0.14 4764.0 924.0 7.0 265.0 0.998533 0.222876 0.0 ... 0.865458 0.777124 0.843792 0.162447 0.910986 0.922537 0.845073 0.854121 0.269958 0.156208
15 P_BAD0 0 0.15 4761.0 906.0 10.0 283.0 0.997904 0.238015 0.0 ... 0.867561 0.761985 0.846309 0.159873 0.912244 0.922537 0.845073 0.854121 0.269958 0.153691
16 P_BAD0 0 0.16 4754.0 868.0 17.0 321.0 0.996437 0.269975 0.0 ... 0.872006 0.730025 0.851510 0.154393 0.914847 0.922537 0.845073 0.854121 0.269958 0.148490
17 P_BAD0 0 0.17 4746.0 828.0 25.0 361.0 0.994760 0.303616 0.0 ... 0.876713 0.696384 0.856879 0.148547 0.917545 0.922537 0.845073 0.854121 0.269958 0.143121
18 P_BAD0 0 0.18 4746.0 828.0 25.0 361.0 0.994760 0.303616 0.0 ... 0.876713 0.696384 0.856879 0.148547 0.917545 0.922537 0.845073 0.854121 0.269958 0.143121
19 P_BAD0 0 0.19 4746.0 828.0 25.0 361.0 0.994760 0.303616 0.0 ... 0.876713 0.696384 0.856879 0.148547 0.917545 0.922537 0.845073 0.854121 0.269958 0.143121
20 P_BAD0 0 0.20 4746.0 828.0 25.0 361.0 0.994760 0.303616 0.0 ... 0.876713 0.696384 0.856879 0.148547 0.917545 0.922537 0.845073 0.854121 0.269958 0.143121
21 P_BAD0 0 0.21 4729.0 760.0 42.0 429.0 0.991197 0.360807 0.0 ... 0.884686 0.639193 0.865436 0.138459 0.921832 0.922537 0.845073 0.854121 0.269958 0.134564
22 P_BAD0 0 0.22 4729.0 760.0 42.0 429.0 0.991197 0.360807 0.0 ... 0.884686 0.639193 0.865436 0.138459 0.921832 0.922537 0.845073 0.854121 0.269958 0.134564
23 P_BAD0 0 0.23 4727.0 753.0 44.0 436.0 0.990778 0.366695 0.0 ... 0.885504 0.633305 0.866275 0.137409 0.922251 0.922537 0.845073 0.854121 0.269958 0.133725
24 P_BAD0 0 0.24 4727.0 753.0 44.0 436.0 0.990778 0.366695 0.0 ... 0.885504 0.633305 0.866275 0.137409 0.922251 0.922537 0.845073 0.854121 0.269958 0.133725
25 P_BAD0 0 0.25 4727.0 753.0 44.0 436.0 0.990778 0.366695 0.0 ... 0.885504 0.633305 0.866275 0.137409 0.922251 0.922537 0.845073 0.854121 0.269958 0.133725
26 P_BAD0 0 0.26 4708.0 696.0 63.0 493.0 0.986795 0.414634 0.0 ... 0.892106 0.585366 0.872651 0.128793 0.925405 0.922537 0.845073 0.854121 0.269958 0.127349
27 P_BAD0 0 0.27 4708.0 696.0 63.0 493.0 0.986795 0.414634 0.0 ... 0.892106 0.585366 0.872651 0.128793 0.925405 0.922537 0.845073 0.854121 0.269958 0.127349
28 P_BAD0 0 0.28 4708.0 696.0 63.0 493.0 0.986795 0.414634 0.0 ... 0.892106 0.585366 0.872651 0.128793 0.925405 0.922537 0.845073 0.854121 0.269958 0.127349
29 P_BAD0 0 0.29 4702.0 681.0 69.0 508.0 0.985538 0.427250 0.0 ... 0.893814 0.572750 0.874161 0.126509 0.926137 0.922537 0.845073 0.854121 0.269958 0.125839
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
70 P_BAD0 0 0.70 4295.0 283.0 476.0 906.0 0.900231 0.761985 0.0 ... 0.930338 0.238015 0.872651 0.061817 0.918815 0.922537 0.845073 0.854121 0.269958 0.127349
71 P_BAD0 0 0.71 4257.0 267.0 514.0 922.0 0.892266 0.775442 0.0 ... 0.930817 0.224558 0.868960 0.059019 0.915976 0.922537 0.845073 0.854121 0.269958 0.131040
72 P_BAD0 0 0.72 4215.0 250.0 556.0 939.0 0.883463 0.789739 0.0 ... 0.931245 0.210261 0.864765 0.055991 0.912733 0.922537 0.845073 0.854121 0.269958 0.135235
73 P_BAD0 0 0.73 4207.0 247.0 564.0 942.0 0.881786 0.792262 0.0 ... 0.931288 0.207738 0.863926 0.055456 0.912087 0.922537 0.845073 0.854121 0.269958 0.136074
74 P_BAD0 0 0.74 4196.0 243.0 575.0 946.0 0.879480 0.795627 0.0 ... 0.931327 0.204373 0.862752 0.054742 0.911183 0.922537 0.845073 0.854121 0.269958 0.137248
75 P_BAD0 0 0.75 4196.0 243.0 575.0 946.0 0.879480 0.795627 0.0 ... 0.931327 0.204373 0.862752 0.054742 0.911183 0.922537 0.845073 0.854121 0.269958 0.137248
76 P_BAD0 0 0.76 4157.0 230.0 614.0 959.0 0.871306 0.806560 0.0 ... 0.931269 0.193440 0.858389 0.052428 0.907840 0.922537 0.845073 0.854121 0.269958 0.141611
77 P_BAD0 0 0.77 4117.0 218.0 654.0 971.0 0.862922 0.816653 0.0 ... 0.930985 0.183347 0.853691 0.050288 0.904239 0.922537 0.845073 0.854121 0.269958 0.146309
78 P_BAD0 0 0.78 4096.0 212.0 675.0 977.0 0.858520 0.821699 0.0 ... 0.930782 0.178301 0.851174 0.049211 0.902302 0.922537 0.845073 0.854121 0.269958 0.148826
79 P_BAD0 0 0.79 4096.0 212.0 675.0 977.0 0.858520 0.821699 0.0 ... 0.930782 0.178301 0.851174 0.049211 0.902302 0.922537 0.845073 0.854121 0.269958 0.148826
80 P_BAD0 0 0.80 4003.0 188.0 768.0 1001.0 0.839027 0.841884 1.0 ... 0.929417 0.158116 0.839597 0.044858 0.893327 0.922537 0.845073 0.854121 0.269958 0.160403
81 P_BAD0 0 0.81 3902.0 163.0 869.0 1026.0 0.817858 0.862910 0.0 ... 0.927678 0.137090 0.826846 0.040098 0.883205 0.922537 0.845073 0.854121 0.269958 0.173154
82 P_BAD0 0 0.82 3835.0 148.0 936.0 1041.0 0.803815 0.875526 0.0 ... 0.926194 0.124474 0.818121 0.037158 0.876171 0.922537 0.845073 0.854121 0.269958 0.181879
83 P_BAD0 0 0.83 3835.0 148.0 936.0 1041.0 0.803815 0.875526 0.0 ... 0.926194 0.124474 0.818121 0.037158 0.876171 0.922537 0.845073 0.854121 0.269958 0.181879
84 P_BAD0 0 0.84 3780.0 137.0 991.0 1052.0 0.792287 0.884777 0.0 ... 0.924703 0.115223 0.810738 0.034976 0.870166 0.922537 0.845073 0.854121 0.269958 0.189262
85 P_BAD0 0 0.85 3780.0 137.0 991.0 1052.0 0.792287 0.884777 0.0 ... 0.924703 0.115223 0.810738 0.034976 0.870166 0.922537 0.845073 0.854121 0.269958 0.189262
86 P_BAD0 0 0.86 3732.0 129.0 1039.0 1060.0 0.782226 0.891505 0.0 ... 0.923077 0.108495 0.804027 0.033411 0.864690 0.922537 0.845073 0.854121 0.269958 0.195973
87 P_BAD0 0 0.87 3706.0 125.0 1065.0 1064.0 0.776776 0.894870 0.0 ... 0.922120 0.105130 0.800336 0.032629 0.861660 0.922537 0.845073 0.854121 0.269958 0.199664
88 P_BAD0 0 0.88 3650.0 117.0 1121.0 1072.0 0.765039 0.901598 0.0 ... 0.919905 0.098402 0.792282 0.031059 0.855001 0.922537 0.845073 0.854121 0.269958 0.207718
89 P_BAD0 0 0.89 3570.0 107.0 1201.0 1082.0 0.748271 0.910008 0.0 ... 0.916371 0.089992 0.780537 0.029100 0.845170 0.922537 0.845073 0.854121 0.269958 0.219463
90 P_BAD0 0 0.90 3570.0 107.0 1201.0 1082.0 0.748271 0.910008 0.0 ... 0.916371 0.089992 0.780537 0.029100 0.845170 0.922537 0.845073 0.854121 0.269958 0.219463
91 P_BAD0 0 0.91 3513.0 101.0 1258.0 1088.0 0.736324 0.915055 0.0 ... 0.913559 0.084945 0.771980 0.027947 0.837925 0.922537 0.845073 0.854121 0.269958 0.228020
92 P_BAD0 0 0.92 3063.0 61.0 1708.0 1128.0 0.642004 0.948696 0.0 ... 0.886952 0.051304 0.703188 0.019526 0.775934 0.922537 0.845073 0.854121 0.269958 0.296812
93 P_BAD0 0 0.93 2985.0 55.0 1786.0 1134.0 0.625655 0.953743 0.0 ... 0.881519 0.046257 0.691107 0.018092 0.764307 0.922537 0.845073 0.854121 0.269958 0.308893
94 P_BAD0 0 0.94 2901.0 49.0 1870.0 1140.0 0.608049 0.958789 0.0 ... 0.875324 0.041211 0.678020 0.016610 0.751457 0.922537 0.845073 0.854121 0.269958 0.321980
95 P_BAD0 0 0.95 2710.0 38.0 2061.0 1151.0 0.568015 0.968040 0.0 ... 0.859608 0.031960 0.647819 0.013828 0.720841 0.922537 0.845073 0.854121 0.269958 0.352181
96 P_BAD0 0 0.96 2451.0 25.0 2320.0 1164.0 0.513729 0.978974 0.0 ... 0.835094 0.021026 0.606544 0.010097 0.676418 0.922537 0.845073 0.854121 0.269958 0.393456
97 P_BAD0 0 0.97 1966.0 8.0 2805.0 1181.0 0.412073 0.993272 0.0 ... 0.776032 0.006728 0.528020 0.004053 0.582950 0.922537 0.845073 0.854121 0.269958 0.471980
98 P_BAD0 0 0.98 1882.0 6.0 2889.0 1183.0 0.394467 0.994954 0.0 ... 0.763613 0.005046 0.514262 0.003178 0.565250 0.922537 0.845073 0.854121 0.269958 0.485738
99 P_BAD0 0 0.99 1631.0 1.0 3140.0 1188.0 0.341857 0.999159 0.0 ... 0.721745 0.000841 0.472987 0.000613 0.509449 0.922537 0.845073 0.854121 0.269958 0.527013

100 rows × 21 columns


§ FitStat_1_DecisionTree
Fit Statistics for _AUTOTUNE_DEFAULT_SCORE_TABLE_
NOBS ASE DIV RASE MCE MCLL
0 5960.0 0.081785 5960.0 0.285982 0.114765 0.262608

§ TunerInfo_1_DecisionTree
Tuner Information
Parameter Value
0 Model Type Decision Tree
1 Tuner Objective Function Misclassification
2 Search Method GRID
3 Number of Grid Points 6
4 Maximum Tuning Time in Seconds 36000
5 Validation Type Cross-Validation
6 Num Folds in Cross-Validation 5
7 Log Level 0
8 Seed 726654185
9 Number of Parallel Evaluations 4
10 Number of Workers per Subsession 0

§ TunerResults_1_DecisionTree
Tuner Results
Evaluation MAXLEVEL NBINS CRIT MeanConseqError EvaluationTime
0 0 11 20 gainRatio 0.140101 0.526701
1 4 15 100 gain 0.115100 1.270005
2 2 15 100 gainRatio 0.119799 1.488537
3 3 10 100 gainRatio 0.122987 1.228688
4 1 10 100 gain 0.129321 0.680179
5 5 5 100 gain 0.138948 0.668237
6 6 5 100 gainRatio 0.149161 0.405213

§ IterationHistory_1_DecisionTree
Tuner Iteration History
Iteration Evaluations Best_obj Time_sec
0 0 1 0.140101 0.526701
1 1 7 0.115100 2.161789

§ EvaluationHistory_1_DecisionTree
Tuner Evaluation History
Evaluation Iteration MAXLEVEL NBINS CRIT MeanConseqError EvaluationTime
0 0 0 11 20 gainRatio 0.140101 0.526701
1 1 1 10 100 gain 0.129321 0.680179
2 2 1 15 100 gainRatio 0.119799 1.488537
3 3 1 10 100 gainRatio 0.122987 1.228688
4 4 1 15 100 gain 0.115100 1.270005
5 5 1 5 100 gain 0.138948 0.668237
6 6 1 5 100 gainRatio 0.149161 0.405213

§ BestConfiguration_1_DecisionTree
Best Configuration
Parameter Name Value
0 Evaluation Evaluation 4
1 Maximum Tree Levels MAXLEVEL 15
2 Maximum Bins NBINS 100
3 Criterion CRIT gain
4 Misclassification Objective 0.1151004199

§ TunerSummary_1_DecisionTree
Tuner Summary
Parameter Value
0 Initial Configuration Objective Value 0.140101
1 Best Configuration Objective Value 0.115100
2 Worst Configuration Objective Value 0.149161
3 Initial Configuration Evaluation Time in Seconds 0.526701
4 Best Configuration Evaluation Time in Seconds 1.126155
5 Number of Improved Configurations 3.000000
6 Number of Evaluated Configurations 7.000000
7 Total Tuning Time in Seconds 2.308218
8 Parallel Tuning Speedup 2.527624

§ TunerTiming_1_DecisionTree
Tuner Task Timing
Task Time_sec Time_percent
0 Model Training 3.962936 67.924703
1 Model Scoring 1.382820 23.701521
2 Total Objective Evaluations 5.348993 91.681710
3 Tuner 0.485315 8.318290
4 Total CPU Time 5.834308 100.000000

§ HyperparameterImportance_1_DecisionTree
Hyperparameter Importance
Hyperparameter RelImportance
0 MAXLEVEL 1.000000
1 CRIT 0.066046
2 NBINS 0.000000

§ ModelInfo_2_GradBoost
Gradient Boosting Tree for __TEMP_FEATURE_MACHINE_CASOUT___AUTOTUNE_11FEB2020:09:55:51
Descr Value
0 Number of Trees 1.500000e+02
1 Distribution 2.000000e+00
2 Learning Rate 1.000000e-01
3 Subsampling Rate 6.000000e-01
4 Number of Selected Variables (M) 4.000000e+00
5 Number of Bins 7.700000e+01
6 Number of Variables 4.000000e+00
7 Max Number of Tree Nodes 1.190000e+02
8 Min Number of Tree Nodes 5.700000e+01
9 Max Number of Branches 2.000000e+00
10 Min Number of Branches 2.000000e+00
11 Max Number of Levels 7.000000e+00
12 Min Number of Levels 7.000000e+00
13 Max Number of Leaves 6.000000e+01
14 Min Number of Leaves 2.900000e+01
15 Maximum Size of Leaves 2.054000e+03
16 Minimum Size of Leaves 5.000000e+00
17 Random Number Seed 7.266544e+08
18 Lasso (L1) penalty 0.000000e+00
19 Ridge (L2) penalty 0.000000e+00
20 Actual Number of Trees 9.500000e+01
21 Average number of Leaves 4.853684e+01
22 Early stopping stagnation 4.000000e+00
23 Early stopping threshold 0.000000e+00
24 Early stopping threshold iterations 0.000000e+00
25 Early stopping tolerance 0.000000e+00

§ EvalMetricInfo_2_GradBoost
Progress Metric
0 1.0 0.199497
1 2.0 0.199497
2 3.0 0.199497
3 4.0 0.199497
4 5.0 0.176174
5 6.0 0.153356
6 7.0 0.149832
7 8.0 0.138758
8 9.0 0.133557
9 10.0 0.130872
10 11.0 0.129027
11 12.0 0.128859
12 13.0 0.128020
13 14.0 0.127181
14 15.0 0.126678
15 16.0 0.123826
16 17.0 0.123154
17 18.0 0.122148
18 19.0 0.121644
19 20.0 0.120805
20 21.0 0.120302
21 22.0 0.120470
22 23.0 0.119128
23 24.0 0.118792
24 25.0 0.118960
25 26.0 0.119128
26 27.0 0.118289
27 28.0 0.118792
28 29.0 0.119799
29 30.0 0.118960
... ... ...
65 66.0 0.112248
66 67.0 0.111745
67 68.0 0.111745
68 69.0 0.112081
69 70.0 0.111242
70 71.0 0.111242
71 72.0 0.111074
72 73.0 0.111745
73 74.0 0.111074
74 75.0 0.110738
75 76.0 0.109228
76 77.0 0.109396
77 78.0 0.108893
78 79.0 0.109396
79 80.0 0.109396
80 81.0 0.109564
81 82.0 0.108725
82 83.0 0.108893
83 84.0 0.109060
84 85.0 0.108893
85 86.0 0.108893
86 87.0 0.108557
87 88.0 0.108557
88 89.0 0.108389
89 90.0 0.108557
90 91.0 0.108557
91 92.0 0.107383
92 93.0 0.107718
93 94.0 0.107383
94 95.0 0.107215

95 rows × 2 columns


§ ScoreInfo_2_GradBoost
Descr Value
0 Number of Observations Read 5960
1 Number of Observations Used 5960
2 Misclassification Error (%) 10.72147651

§ ErrorMetricInfo_2_GradBoost
TreeID Trees NLeaves MCR LogLoss ASE RASE MAXAE
0 0.0 1.0 47.0 0.199497 0.458278 0.145427 0.381349 0.819707
1 1.0 2.0 99.0 0.199497 0.429789 0.134712 0.367032 0.836517
2 2.0 3.0 152.0 0.199497 0.408185 0.126404 0.355533 0.851858
3 3.0 4.0 211.0 0.199497 0.390526 0.119608 0.345844 0.863870
4 4.0 5.0 263.0 0.176174 0.376211 0.114106 0.337796 0.875788
5 5.0 6.0 322.0 0.153356 0.364140 0.109631 0.331106 0.887132
6 6.0 7.0 377.0 0.149832 0.353821 0.105886 0.325401 0.895821
7 7.0 8.0 431.0 0.138758 0.345072 0.102787 0.320604 0.904793
8 8.0 9.0 485.0 0.133557 0.337656 0.100209 0.316557 0.912727
9 9.0 10.0 541.0 0.130872 0.331382 0.098148 0.313286 0.919855
10 10.0 11.0 597.0 0.129027 0.326246 0.096513 0.310666 0.925717
11 11.0 12.0 644.0 0.128859 0.321327 0.094965 0.308164 0.932808
12 12.0 13.0 697.0 0.128020 0.317239 0.093821 0.306303 0.936113
13 13.0 14.0 753.0 0.127181 0.313479 0.092764 0.304572 0.940550
14 14.0 15.0 805.0 0.126678 0.309958 0.091747 0.302898 0.945170
15 15.0 16.0 860.0 0.123826 0.307258 0.091024 0.301702 0.950417
16 16.0 17.0 908.0 0.123154 0.304862 0.090389 0.300647 0.955154
17 17.0 18.0 955.0 0.122148 0.302712 0.089834 0.299723 0.959430
18 18.0 19.0 1005.0 0.121644 0.300429 0.089242 0.298734 0.963297
19 19.0 20.0 1053.0 0.120805 0.298622 0.088826 0.298037 0.964853
20 20.0 21.0 1104.0 0.120302 0.297092 0.088430 0.297371 0.966991
21 21.0 22.0 1149.0 0.120470 0.295923 0.088186 0.296961 0.969630
22 22.0 23.0 1209.0 0.119128 0.294346 0.087707 0.296154 0.971256
23 23.0 24.0 1261.0 0.118792 0.292845 0.087329 0.295515 0.973038
24 24.0 25.0 1311.0 0.118960 0.291433 0.086961 0.294891 0.974278
25 25.0 26.0 1350.0 0.119128 0.290367 0.086703 0.294453 0.974644
26 26.0 27.0 1398.0 0.118289 0.289288 0.086450 0.294024 0.975850
27 27.0 28.0 1447.0 0.118792 0.288402 0.086284 0.293742 0.975689
28 28.0 29.0 1500.0 0.119799 0.287382 0.086082 0.293397 0.975563
29 29.0 30.0 1548.0 0.118960 0.286730 0.085935 0.293147 0.976641
... ... ... ... ... ... ... ... ...
65 65.0 66.0 3270.0 0.112248 0.266318 0.080674 0.284032 0.990752
66 66.0 67.0 3316.0 0.111745 0.266048 0.080593 0.283889 0.991643
67 67.0 68.0 3366.0 0.111745 0.265629 0.080465 0.283663 0.991453
68 68.0 69.0 3417.0 0.112081 0.265382 0.080431 0.283604 0.991491
69 69.0 70.0 3461.0 0.111242 0.265173 0.080362 0.283481 0.991036
70 70.0 71.0 3512.0 0.111242 0.264849 0.080279 0.283335 0.991385
71 71.0 72.0 3560.0 0.111074 0.264615 0.080229 0.283248 0.991280
72 72.0 73.0 3594.0 0.111745 0.264370 0.080155 0.283117 0.991323
73 73.0 74.0 3647.0 0.111074 0.263976 0.080028 0.282892 0.992165
74 74.0 75.0 3692.0 0.110738 0.263605 0.079898 0.282662 0.991208
75 75.0 76.0 3744.0 0.109228 0.263072 0.079740 0.282382 0.992044
76 76.0 77.0 3784.0 0.109396 0.262907 0.079695 0.282303 0.992801
77 77.0 78.0 3829.0 0.108893 0.262714 0.079631 0.282190 0.992319
78 78.0 79.0 3882.0 0.109396 0.262395 0.079533 0.282017 0.992448
79 79.0 80.0 3935.0 0.109396 0.262050 0.079434 0.281841 0.992526
80 80.0 81.0 3991.0 0.109564 0.261739 0.079324 0.281646 0.992363
81 81.0 82.0 4036.0 0.108725 0.261424 0.079223 0.281466 0.992340
82 82.0 83.0 4084.0 0.108893 0.261186 0.079162 0.281357 0.992201
83 83.0 84.0 4126.0 0.109060 0.260849 0.079060 0.281176 0.992949
84 84.0 85.0 4157.0 0.108893 0.260677 0.079007 0.281082 0.993063
85 85.0 86.0 4191.0 0.108893 0.260500 0.078969 0.281014 0.992918
86 86.0 87.0 4238.0 0.108557 0.260328 0.078914 0.280916 0.992089
87 87.0 88.0 4297.0 0.108557 0.260030 0.078824 0.280756 0.991859
88 88.0 89.0 4342.0 0.108389 0.259787 0.078738 0.280603 0.991826
89 89.0 90.0 4383.0 0.108557 0.259557 0.078666 0.280474 0.991302
90 90.0 91.0 4423.0 0.108557 0.259309 0.078579 0.280320 0.991467
91 91.0 92.0 4477.0 0.107383 0.258983 0.078481 0.280144 0.991545
92 92.0 93.0 4520.0 0.107718 0.258736 0.078415 0.280027 0.991751
93 93.0 94.0 4567.0 0.107383 0.258461 0.078372 0.279949 0.991910
94 94.0 95.0 4611.0 0.107215 0.258225 0.078295 0.279812 0.991582

95 rows × 8 columns


§ EncodedName_2_GradBoost
LEVNAME LEVINDEX VARNAME
0 1 0 P_BAD1
1 0 1 P_BAD0

§ EncodedTargetName_2_GradBoost
LEVNAME LEVINDEX VARNAME
0 0 I_BAD

§ ROCInfo_2_GradBoost
ROC Information for _AUTOTUNE_DEFAULT_SCORE_TABLE_
Variable Event CutOff TP FP FN TN Sensitivity Specificity KS ... F_HALF FPR ACC FDR F1 C Gini Gamma Tau MISCEVENT
0 P_BAD0 0 0.00 4771.0 1189.0 0.0 0.0 1.000000 0.000000 0.0 ... 0.833770 1.000000 0.800503 0.199497 0.889200 0.926975 0.853951 0.862263 0.272794 0.199497
1 P_BAD0 0 0.01 4771.0 1178.0 0.0 11.0 1.000000 0.009251 0.0 ... 0.835054 0.990749 0.802349 0.198016 0.890112 0.926975 0.853951 0.862263 0.272794 0.197651
2 P_BAD0 0 0.02 4771.0 1170.0 0.0 19.0 1.000000 0.015980 0.0 ... 0.835991 0.984020 0.803691 0.196937 0.890777 0.926975 0.853951 0.862263 0.272794 0.196309
3 P_BAD0 0 0.03 4770.0 1123.0 1.0 66.0 0.999790 0.055509 0.0 ... 0.841478 0.944491 0.811409 0.190565 0.894599 0.926975 0.853951 0.862263 0.272794 0.188591
4 P_BAD0 0 0.04 4770.0 1098.0 1.0 91.0 0.999790 0.076535 0.0 ... 0.844457 0.923465 0.815604 0.187117 0.896701 0.926975 0.853951 0.862263 0.272794 0.184396
5 P_BAD0 0 0.05 4770.0 1072.0 1.0 117.0 0.999790 0.098402 0.0 ... 0.847578 0.901598 0.819966 0.183499 0.898898 0.926975 0.853951 0.862263 0.272794 0.180034
6 P_BAD0 0 0.06 4769.0 1043.0 2.0 146.0 0.999581 0.122792 0.0 ... 0.851030 0.877208 0.824664 0.179456 0.901257 0.926975 0.853951 0.862263 0.272794 0.175336
7 P_BAD0 0 0.07 4768.0 1018.0 3.0 171.0 0.999371 0.143818 0.0 ... 0.854021 0.856182 0.828691 0.175942 0.903287 0.926975 0.853951 0.862263 0.272794 0.171309
8 P_BAD0 0 0.08 4768.0 994.0 3.0 195.0 0.999371 0.164003 0.0 ... 0.856968 0.835997 0.832718 0.172510 0.905345 0.926975 0.853951 0.862263 0.272794 0.167282
9 P_BAD0 0 0.09 4766.0 978.0 5.0 211.0 0.998952 0.177460 0.0 ... 0.858832 0.822540 0.835067 0.170265 0.906515 0.926975 0.853951 0.862263 0.272794 0.164933
10 P_BAD0 0 0.10 4766.0 966.0 5.0 223.0 0.998952 0.187553 0.0 ... 0.860320 0.812447 0.837081 0.168528 0.907550 0.926975 0.853951 0.862263 0.272794 0.162919
11 P_BAD0 0 0.11 4765.0 952.0 6.0 237.0 0.998742 0.199327 0.0 ... 0.862007 0.800673 0.839262 0.166521 0.908658 0.926975 0.853951 0.862263 0.272794 0.160738
12 P_BAD0 0 0.12 4764.0 939.0 7.0 250.0 0.998533 0.210261 0.0 ... 0.863575 0.789739 0.841275 0.164650 0.909681 0.926975 0.853951 0.862263 0.272794 0.158725
13 P_BAD0 0 0.13 4764.0 933.0 7.0 256.0 0.998533 0.215307 0.0 ... 0.864327 0.784693 0.842282 0.163770 0.910203 0.926975 0.853951 0.862263 0.272794 0.157718
14 P_BAD0 0 0.14 4763.0 911.0 8.0 278.0 0.998323 0.233810 0.0 ... 0.867040 0.766190 0.845805 0.160557 0.912015 0.926975 0.853951 0.862263 0.272794 0.154195
15 P_BAD0 0 0.15 4762.0 902.0 9.0 287.0 0.998114 0.241379 0.0 ... 0.868123 0.758621 0.847148 0.159251 0.912698 0.926975 0.853951 0.862263 0.272794 0.152852
16 P_BAD0 0 0.16 4755.0 853.0 16.0 336.0 0.996646 0.282590 0.0 ... 0.873984 0.717410 0.854195 0.152104 0.916273 0.926975 0.853951 0.862263 0.272794 0.145805
17 P_BAD0 0 0.17 4755.0 844.0 16.0 345.0 0.996646 0.290160 0.0 ... 0.875143 0.709840 0.855705 0.150741 0.917068 0.926975 0.853951 0.862263 0.272794 0.144295
18 P_BAD0 0 0.18 4754.0 836.0 17.0 353.0 0.996437 0.296888 0.0 ... 0.876120 0.703112 0.856879 0.149553 0.917672 0.926975 0.853951 0.862263 0.272794 0.143121
19 P_BAD0 0 0.19 4750.0 819.0 21.0 370.0 0.995598 0.311186 0.0 ... 0.878101 0.688814 0.859060 0.147064 0.918762 0.926975 0.853951 0.862263 0.272794 0.140940
20 P_BAD0 0 0.20 4748.0 807.0 23.0 382.0 0.995179 0.321278 0.0 ... 0.879552 0.678722 0.860738 0.145275 0.919620 0.926975 0.853951 0.862263 0.272794 0.139262
21 P_BAD0 0 0.21 4745.0 787.0 26.0 402.0 0.994550 0.338099 0.0 ... 0.882003 0.661901 0.863591 0.142263 0.921091 0.926975 0.853951 0.862263 0.272794 0.136409
22 P_BAD0 0 0.22 4741.0 766.0 30.0 423.0 0.993712 0.355761 0.0 ... 0.884548 0.644239 0.866443 0.139096 0.922553 0.926975 0.853951 0.862263 0.272794 0.133557
23 P_BAD0 0 0.23 4740.0 763.0 31.0 426.0 0.993502 0.358284 0.0 ... 0.884890 0.641716 0.866779 0.138652 0.922718 0.926975 0.853951 0.862263 0.272794 0.133221
24 P_BAD0 0 0.24 4740.0 760.0 31.0 429.0 0.993502 0.360807 0.0 ... 0.885286 0.639193 0.867282 0.138182 0.922987 0.926975 0.853951 0.862263 0.272794 0.132718
25 P_BAD0 0 0.25 4739.0 758.0 32.0 431.0 0.993293 0.362489 0.0 ... 0.885496 0.637511 0.867450 0.137893 0.923062 0.926975 0.853951 0.862263 0.272794 0.132550
26 P_BAD0 0 0.26 4738.0 750.0 33.0 439.0 0.993083 0.369218 0.0 ... 0.886502 0.630782 0.868624 0.136662 0.923677 0.926975 0.853951 0.862263 0.272794 0.131376
27 P_BAD0 0 0.27 4734.0 736.0 37.0 453.0 0.992245 0.380992 0.0 ... 0.888147 0.619008 0.870302 0.134552 0.924519 0.926975 0.853951 0.862263 0.272794 0.129698
28 P_BAD0 0 0.28 4733.0 735.0 38.0 454.0 0.992035 0.381833 0.0 ... 0.888226 0.618167 0.870302 0.134418 0.924504 0.926975 0.853951 0.862263 0.272794 0.129698
29 P_BAD0 0 0.29 4719.0 697.0 52.0 492.0 0.989101 0.413793 0.0 ... 0.892567 0.586207 0.874329 0.128693 0.926475 0.926975 0.853951 0.862263 0.272794 0.125671
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
70 P_BAD0 0 0.70 4270.0 243.0 501.0 946.0 0.894991 0.795627 0.0 ... 0.935460 0.204373 0.875168 0.053844 0.919862 0.926975 0.853951 0.862263 0.272794 0.124832
71 P_BAD0 0 0.71 4213.0 223.0 558.0 966.0 0.883043 0.812447 0.0 ... 0.935598 0.187553 0.868960 0.050271 0.915173 0.926975 0.853951 0.862263 0.272794 0.131040
72 P_BAD0 0 0.72 4195.0 217.0 576.0 972.0 0.879271 0.817494 0.0 ... 0.935590 0.182506 0.866946 0.049184 0.913645 0.926975 0.853951 0.862263 0.272794 0.133054
73 P_BAD0 0 0.73 4180.0 212.0 591.0 977.0 0.876127 0.821699 0.0 ... 0.935584 0.178301 0.865268 0.048270 0.912365 0.926975 0.853951 0.862263 0.272794 0.134732
74 P_BAD0 0 0.74 4163.0 207.0 608.0 982.0 0.872563 0.825904 0.0 ... 0.935464 0.174096 0.863255 0.047368 0.910841 0.926975 0.853951 0.862263 0.272794 0.136745
75 P_BAD0 0 0.75 4138.0 200.0 633.0 989.0 0.867323 0.831791 0.0 ... 0.935226 0.168209 0.860235 0.046104 0.908552 0.926975 0.853951 0.862263 0.272794 0.139765
76 P_BAD0 0 0.76 4125.0 195.0 646.0 994.0 0.864599 0.835997 0.0 ... 0.935332 0.164003 0.858893 0.045139 0.907491 0.926975 0.853951 0.862263 0.272794 0.141107
77 P_BAD0 0 0.77 4113.0 193.0 658.0 996.0 0.862083 0.837679 0.0 ... 0.934985 0.162321 0.857215 0.044821 0.906247 0.926975 0.853951 0.862263 0.272794 0.142785
78 P_BAD0 0 0.78 4107.0 190.0 664.0 999.0 0.860826 0.840202 0.0 ... 0.935152 0.159798 0.856711 0.044217 0.905823 0.926975 0.853951 0.862263 0.272794 0.143289
79 P_BAD0 0 0.79 4100.0 186.0 671.0 1003.0 0.859359 0.843566 0.0 ... 0.935432 0.156434 0.856208 0.043397 0.905377 0.926975 0.853951 0.862263 0.272794 0.143792
80 P_BAD0 0 0.80 3991.0 158.0 780.0 1031.0 0.836512 0.867115 1.0 ... 0.933917 0.132885 0.842617 0.038081 0.894843 0.926975 0.853951 0.862263 0.272794 0.157383
81 P_BAD0 0 0.81 3971.0 155.0 800.0 1034.0 0.832320 0.869638 0.0 ... 0.933255 0.130362 0.839765 0.037567 0.892660 0.926975 0.853951 0.862263 0.272794 0.160235
82 P_BAD0 0 0.82 3950.0 150.0 821.0 1039.0 0.827919 0.873844 0.0 ... 0.932880 0.126156 0.837081 0.036585 0.890542 0.926975 0.853951 0.862263 0.272794 0.162919
83 P_BAD0 0 0.83 3921.0 143.0 850.0 1046.0 0.821840 0.879731 0.0 ... 0.932373 0.120269 0.833389 0.035187 0.887606 0.926975 0.853951 0.862263 0.272794 0.166611
84 P_BAD0 0 0.84 3896.0 139.0 875.0 1050.0 0.816600 0.883095 0.0 ... 0.931567 0.116905 0.829866 0.034449 0.884851 0.926975 0.853951 0.862263 0.272794 0.170134
85 P_BAD0 0 0.85 3856.0 132.0 915.0 1057.0 0.808216 0.888982 0.0 ... 0.930367 0.111018 0.824329 0.033099 0.880466 0.926975 0.853951 0.862263 0.272794 0.175671
86 P_BAD0 0 0.86 3807.0 123.0 964.0 1066.0 0.797946 0.896552 0.0 ... 0.928944 0.103448 0.817617 0.031298 0.875072 0.926975 0.853951 0.862263 0.272794 0.182383
87 P_BAD0 0 0.87 3727.0 111.0 1044.0 1078.0 0.781178 0.906644 0.0 ... 0.926055 0.093356 0.806208 0.028921 0.865838 0.926975 0.853951 0.862263 0.272794 0.193792
88 P_BAD0 0 0.88 3678.0 106.0 1093.0 1083.0 0.770908 0.910849 0.0 ... 0.923796 0.089151 0.798826 0.028013 0.859848 0.926975 0.853951 0.862263 0.272794 0.201174
89 P_BAD0 0 0.89 3637.0 103.0 1134.0 1086.0 0.762314 0.913373 0.0 ... 0.921646 0.086627 0.792450 0.027540 0.854659 0.926975 0.853951 0.862263 0.272794 0.207550
90 P_BAD0 0 0.90 3606.0 102.0 1165.0 1087.0 0.755816 0.914214 0.0 ... 0.919757 0.085786 0.787416 0.027508 0.850572 0.926975 0.853951 0.862263 0.272794 0.212584
91 P_BAD0 0 0.91 3527.0 95.0 1244.0 1094.0 0.739258 0.920101 0.0 ... 0.915676 0.079899 0.775336 0.026229 0.840462 0.926975 0.853951 0.862263 0.272794 0.224664
92 P_BAD0 0 0.92 3383.0 86.0 1388.0 1103.0 0.709076 0.927670 0.0 ... 0.907116 0.072330 0.752685 0.024791 0.821117 0.926975 0.853951 0.862263 0.272794 0.247315
93 P_BAD0 0 0.93 3255.0 78.0 1516.0 1111.0 0.682247 0.934399 0.0 ... 0.899022 0.065601 0.732550 0.023402 0.803307 0.926975 0.853951 0.862263 0.272794 0.267450
94 P_BAD0 0 0.94 3020.0 60.0 1751.0 1129.0 0.632991 0.949537 0.0 ... 0.883506 0.050463 0.696141 0.019481 0.769329 0.926975 0.853951 0.862263 0.272794 0.303859
95 P_BAD0 0 0.95 2558.0 33.0 2213.0 1156.0 0.536156 0.972246 0.0 ... 0.845061 0.027754 0.623154 0.012736 0.694920 0.926975 0.853951 0.862263 0.272794 0.376846
96 P_BAD0 0 0.96 2374.0 24.0 2397.0 1165.0 0.497590 0.979815 0.0 ... 0.826429 0.020185 0.593792 0.010008 0.662296 0.926975 0.853951 0.862263 0.272794 0.406208
97 P_BAD0 0 0.97 1969.0 14.0 2802.0 1175.0 0.412702 0.988225 0.0 ... 0.775014 0.011775 0.527517 0.007060 0.583062 0.926975 0.853951 0.862263 0.272794 0.472483
98 P_BAD0 0 0.98 1257.0 1.0 3514.0 1188.0 0.263467 0.999159 0.0 ... 0.641130 0.000841 0.410235 0.000795 0.416985 0.926975 0.853951 0.862263 0.272794 0.589765
99 P_BAD0 0 0.99 869.0 1.0 3902.0 1188.0 0.182142 0.999159 0.0 ... 0.526603 0.000841 0.345134 0.001149 0.308101 0.926975 0.853951 0.862263 0.272794 0.654866

100 rows × 21 columns


§ FitStat_2_GradBoost
Fit Statistics for _AUTOTUNE_DEFAULT_SCORE_TABLE_
NOBS ASE DIV RASE MCE MCLL
0 5960.0 0.078295 5960.0 0.279812 0.107215 0.258225

§ TunerInfo_2_GradBoost
Tuner Information
Parameter Value
0 Model Type Gradient Boosting Tree
1 Tuner Objective Function Misclassification
2 Search Method GRID
3 Number of Grid Points 16
4 Maximum Tuning Time in Seconds 36000
5 Validation Type Cross-Validation
6 Num Folds in Cross-Validation 5
7 Log Level 0
8 Seed 726654418
9 Number of Parallel Evaluations 4
10 Number of Workers per Subsession 0

§ TunerResults_2_GradBoost
Tuner Results
Evaluation M LEARNINGRATE SUBSAMPLERATE LASSO RIDGE NBINS MAXLEVEL MeanConseqError EvaluationTime
0 0 4 0.10 0.5 0.0 1.0 50 5 0.186242 0.924383
1 6 4 0.10 0.6 0.0 0.0 77 7 0.121141 11.662343
2 2 4 0.10 0.8 0.0 0.0 77 7 0.122148 10.705619
3 5 4 0.10 0.8 0.5 0.0 77 7 0.122987 9.525259
4 10 4 0.10 0.6 0.5 0.0 77 7 0.134009 5.635216
5 16 4 0.10 0.6 0.0 0.0 77 5 0.136934 2.912493
6 13 4 0.10 0.8 0.0 0.0 77 5 0.145796 3.245351
7 9 4 0.10 0.8 0.5 0.0 77 5 0.174442 3.087513
8 14 4 0.05 0.8 0.5 0.0 77 5 0.199216 0.955785
9 7 4 0.05 0.8 0.0 0.0 77 5 0.199362 1.859044
10 12 4 0.10 0.6 0.5 0.0 77 5 0.199385 1.039984

§ IterationHistory_2_GradBoost
Tuner Iteration History
Iteration Evaluations Best_obj Time_sec
0 0 1 0.186242 0.924383
1 1 17 0.121141 17.755302

§ EvaluationHistory_2_GradBoost
Tuner Evaluation History
Evaluation Iteration M LEARNINGRATE SUBSAMPLERATE LASSO RIDGE NBINS MAXLEVEL MeanConseqError EvaluationTime
0 0 0 4 0.10 0.5 0.0 1.0 50 5 0.186242 0.924383
1 1 1 4 0.05 0.6 0.5 0.0 77 5 0.199430 1.235530
2 2 1 4 0.10 0.8 0.0 0.0 77 7 0.122148 10.705619
3 3 1 4 0.05 0.6 0.0 0.0 77 7 0.199664 2.768309
4 4 1 4 0.05 0.6 0.0 0.0 77 5 0.199664 2.254907
5 5 1 4 0.10 0.8 0.5 0.0 77 7 0.122987 9.525259
6 6 1 4 0.10 0.6 0.0 0.0 77 7 0.121141 11.662343
7 7 1 4 0.05 0.8 0.0 0.0 77 5 0.199362 1.859044
8 8 1 4 0.05 0.6 0.5 0.0 77 7 0.199497 3.138668
9 9 1 4 0.10 0.8 0.5 0.0 77 5 0.174442 3.087513
10 10 1 4 0.10 0.6 0.5 0.0 77 7 0.134009 5.635216
11 11 1 4 0.05 0.8 0.0 0.0 77 7 0.199609 1.252530
12 12 1 4 0.10 0.6 0.5 0.0 77 5 0.199385 1.039984
13 13 1 4 0.10 0.8 0.0 0.0 77 5 0.145796 3.245351
14 14 1 4 0.05 0.8 0.5 0.0 77 5 0.199216 0.955785
15 15 1 4 0.05 0.8 0.5 0.0 77 7 0.199553 1.230367
16 16 1 4 0.10 0.6 0.0 0.0 77 5 0.136934 2.912493

§ BestConfiguration_2_GradBoost
Best Configuration
Parameter Name Value
0 Evaluation Evaluation 6
1 Number of Variables to Try M 4
2 Learning Rate LEARNINGRATE 0.1
3 Sampling Rate SUBSAMPLERATE 0.6
4 Lasso LASSO 0
5 Ridge RIDGE 0
6 Number of Bins NBINS 77
7 Maximum Tree Levels MAXLEVEL 7
8 Misclassification Objective 0.1211409396

§ TunerSummary_2_GradBoost
Tuner Summary
Parameter Value
0 Initial Configuration Objective Value 0.186242
1 Best Configuration Objective Value 0.121141
2 Worst Configuration Objective Value 0.199664
3 Initial Configuration Evaluation Time in Seconds 0.924383
4 Best Configuration Evaluation Time in Seconds 11.662332
5 Number of Improved Configurations 2.000000
6 Number of Evaluated Configurations 17.000000
7 Total Tuning Time in Seconds 19.482536
8 Parallel Tuning Speedup 3.315336

§ TunerTiming_2_GradBoost
Tuner Task Timing
Task Time_sec Time_percent
0 Model Training 59.722464 92.462293
1 Model Scoring 4.346670 6.729513
2 Total Objective Evaluations 64.076689 99.203503
3 Tuner 0.514467 0.796497
4 Total CPU Time 64.591156 100.000000

§ HyperparameterImportance_2_GradBoost
Hyperparameter Importance
Hyperparameter RelImportance
0 LEARNINGRATE 1.000000
1 MAXLEVEL 0.157487
2 LASSO 0.045519
3 SUBSAMPLERATE 0.008810
4 M 0.000000
5 RIDGE 0.000000
6 NBINS 0.000000

§ ModelInfo_1_GradBoost
Gradient Boosting Tree for __TEMP_FEATURE_MACHINE_CASOUT___AUTOTUNE_11FEB2020:09:56:13
Descr Value
0 Number of Trees 1.500000e+02
1 Distribution 2.000000e+00
2 Learning Rate 1.000000e-01
3 Subsampling Rate 6.000000e-01
4 Number of Selected Variables (M) 4.000000e+00
5 Number of Bins 7.700000e+01
6 Number of Variables 4.000000e+00
7 Max Number of Tree Nodes 1.070000e+02
8 Min Number of Tree Nodes 4.300000e+01
9 Max Number of Branches 2.000000e+00
10 Min Number of Branches 2.000000e+00
11 Max Number of Levels 7.000000e+00
12 Min Number of Levels 7.000000e+00
13 Max Number of Leaves 5.400000e+01
14 Min Number of Leaves 2.200000e+01
15 Maximum Size of Leaves 3.163000e+03
16 Minimum Size of Leaves 5.000000e+00
17 Random Number Seed 7.266564e+08
18 Lasso (L1) penalty 0.000000e+00
19 Ridge (L2) penalty 0.000000e+00
20 Actual Number of Trees 9.200000e+01
21 Average number of Leaves 4.116304e+01
22 Early stopping stagnation 4.000000e+00
23 Early stopping threshold 0.000000e+00
24 Early stopping threshold iterations 0.000000e+00
25 Early stopping tolerance 0.000000e+00

§ EvalMetricInfo_1_GradBoost
Progress Metric
0 1.0 0.199497
1 2.0 0.199497
2 3.0 0.199497
3 4.0 0.197483
4 5.0 0.165436
5 6.0 0.152181
6 7.0 0.139933
7 8.0 0.136242
8 9.0 0.131879
9 10.0 0.130201
10 11.0 0.128188
11 12.0 0.126846
12 13.0 0.126007
13 14.0 0.125671
14 15.0 0.125000
15 16.0 0.123490
16 17.0 0.120134
17 18.0 0.119463
18 19.0 0.118960
19 20.0 0.117450
20 21.0 0.118289
21 22.0 0.116946
22 23.0 0.116779
23 24.0 0.116946
24 25.0 0.116443
25 26.0 0.115940
26 27.0 0.116443
27 28.0 0.116443
28 29.0 0.115604
29 30.0 0.115604
... ... ...
62 63.0 0.111577
63 64.0 0.111242
64 65.0 0.111577
65 66.0 0.111577
66 67.0 0.111242
67 68.0 0.111074
68 69.0 0.111074
69 70.0 0.111074
70 71.0 0.110906
71 72.0 0.110067
72 73.0 0.110067
73 74.0 0.109732
74 75.0 0.109396
75 76.0 0.109564
76 77.0 0.109396
77 78.0 0.109228
78 79.0 0.109396
79 80.0 0.108893
80 81.0 0.108389
81 82.0 0.107886
82 83.0 0.107550
83 84.0 0.108054
84 85.0 0.108054
85 86.0 0.107886
86 87.0 0.107886
87 88.0 0.107550
88 89.0 0.107718
89 90.0 0.107383
90 91.0 0.107047
91 92.0 0.106879

92 rows × 2 columns


§ ScoreInfo_1_GradBoost
Descr Value
0 Number of Observations Read 5960
1 Number of Observations Used 5960
2 Misclassification Error (%) 10.687919463

§ ErrorMetricInfo_1_GradBoost
TreeID Trees NLeaves MCR LogLoss ASE RASE MAXAE
0 0.0 1.0 44.0 0.199497 0.461408 0.146438 0.382672 0.819707
1 1.0 2.0 90.0 0.199497 0.433832 0.135944 0.368706 0.834834
2 2.0 3.0 131.0 0.199497 0.412887 0.127711 0.357367 0.850252
3 3.0 4.0 175.0 0.197483 0.396104 0.121037 0.347903 0.863593
4 4.0 5.0 218.0 0.165436 0.382247 0.115562 0.339944 0.875763
5 5.0 6.0 269.0 0.152181 0.370704 0.111132 0.333364 0.887209
6 6.0 7.0 322.0 0.139933 0.360500 0.107283 0.327541 0.896381
7 7.0 8.0 367.0 0.136242 0.351936 0.104032 0.322540 0.905584
8 8.0 9.0 413.0 0.131879 0.344754 0.101382 0.318406 0.913174
9 9.0 10.0 461.0 0.130201 0.338286 0.099128 0.314845 0.920931
10 10.0 11.0 506.0 0.128188 0.333539 0.097567 0.312358 0.927915
11 11.0 12.0 556.0 0.126846 0.328881 0.096072 0.309954 0.934397
12 12.0 13.0 600.0 0.126007 0.325340 0.094981 0.308190 0.940685
13 13.0 14.0 652.0 0.125671 0.321739 0.093844 0.306340 0.944621
14 14.0 15.0 691.0 0.125000 0.319247 0.093099 0.305122 0.949488
15 15.0 16.0 741.0 0.123490 0.316781 0.092398 0.303970 0.953769
16 16.0 17.0 791.0 0.120134 0.314452 0.091752 0.302905 0.957063
17 17.0 18.0 833.0 0.119463 0.312562 0.091262 0.302096 0.960383
18 18.0 19.0 880.0 0.118960 0.310749 0.090790 0.301315 0.962972
19 19.0 20.0 922.0 0.117450 0.309101 0.090350 0.300583 0.965429
20 20.0 21.0 968.0 0.118289 0.307656 0.090018 0.300030 0.968304
21 21.0 22.0 1015.0 0.116946 0.306337 0.089685 0.299474 0.970400
22 22.0 23.0 1061.0 0.116779 0.304994 0.089355 0.298923 0.972872
23 23.0 24.0 1110.0 0.116946 0.303827 0.089040 0.298395 0.974970
24 24.0 25.0 1144.0 0.116443 0.303005 0.088842 0.298064 0.976651
25 25.0 26.0 1194.0 0.115940 0.301961 0.088591 0.297642 0.978143
26 26.0 27.0 1242.0 0.116443 0.301117 0.088399 0.297320 0.978327
27 27.0 28.0 1292.0 0.116443 0.300058 0.088100 0.296817 0.978803
28 28.0 29.0 1337.0 0.115604 0.299002 0.087833 0.296366 0.980506
29 29.0 30.0 1362.0 0.115604 0.298444 0.087704 0.296149 0.981148
... ... ... ... ... ... ... ... ...
62 62.0 63.0 2632.0 0.111577 0.284163 0.083766 0.289423 0.992511
63 63.0 64.0 2654.0 0.111242 0.284055 0.083730 0.289361 0.992483
64 64.0 65.0 2692.0 0.111577 0.283762 0.083637 0.289200 0.992264
65 65.0 66.0 2730.0 0.111577 0.283420 0.083522 0.289001 0.992305
66 66.0 67.0 2763.0 0.111242 0.283227 0.083460 0.288895 0.992796
67 67.0 68.0 2815.0 0.111074 0.282907 0.083357 0.288716 0.992340
68 68.0 69.0 2845.0 0.111074 0.282621 0.083256 0.288541 0.992099
69 69.0 70.0 2898.0 0.111074 0.282254 0.083156 0.288368 0.992214
70 70.0 71.0 2939.0 0.110906 0.281983 0.083075 0.288228 0.992836
71 71.0 72.0 2984.0 0.110067 0.281640 0.082949 0.288009 0.992799
72 72.0 73.0 3026.0 0.110067 0.281176 0.082785 0.287723 0.992760
73 73.0 74.0 3068.0 0.109732 0.280772 0.082656 0.287499 0.992539
74 74.0 75.0 3103.0 0.109396 0.280478 0.082574 0.287357 0.992521
75 75.0 76.0 3132.0 0.109564 0.280179 0.082478 0.287189 0.992605
76 76.0 77.0 3172.0 0.109396 0.279915 0.082395 0.287045 0.992634
77 77.0 78.0 3219.0 0.109228 0.279401 0.082226 0.286750 0.992596
78 78.0 79.0 3257.0 0.109396 0.279225 0.082170 0.286652 0.992964
79 79.0 80.0 3297.0 0.108893 0.279044 0.082106 0.286541 0.992752
80 80.0 81.0 3341.0 0.108389 0.278749 0.081996 0.286349 0.992967
81 81.0 82.0 3376.0 0.107886 0.278520 0.081924 0.286224 0.993080
82 82.0 83.0 3405.0 0.107550 0.278331 0.081886 0.286158 0.993119
83 83.0 84.0 3450.0 0.108054 0.278146 0.081839 0.286075 0.993115
84 84.0 85.0 3488.0 0.108054 0.277901 0.081763 0.285942 0.992976
85 85.0 86.0 3532.0 0.107886 0.277665 0.081689 0.285813 0.993760
86 86.0 87.0 3579.0 0.107886 0.277380 0.081571 0.285607 0.993928
87 87.0 88.0 3620.0 0.107550 0.277217 0.081516 0.285510 0.994512
88 88.0 89.0 3658.0 0.107718 0.277042 0.081462 0.285416 0.994651
89 89.0 90.0 3691.0 0.107383 0.276871 0.081407 0.285319 0.993901
90 90.0 91.0 3733.0 0.107047 0.276669 0.081365 0.285246 0.993849
91 91.0 92.0 3787.0 0.106879 0.276219 0.081221 0.284993 0.993834

92 rows × 8 columns


§ EncodedName_1_GradBoost
LEVNAME LEVINDEX VARNAME
0 1 0 P_BAD1
1 0 1 P_BAD0

§ EncodedTargetName_1_GradBoost
LEVNAME LEVINDEX VARNAME
0 0 I_BAD

§ ROCInfo_1_GradBoost
ROC Information for _AUTOTUNE_DEFAULT_SCORE_TABLE_
Variable Event CutOff TP FP FN TN Sensitivity Specificity KS ... F_HALF FPR ACC FDR F1 C Gini Gamma Tau MISCEVENT
0 P_BAD0 0 0.00 4771.0 1189.0 0.0 0.0 1.000000 0.000000 0.0 ... 0.833770 1.000000 0.800503 0.199497 0.889200 0.901954 0.803909 0.816656 0.256808 0.199497
1 P_BAD0 0 0.01 4771.0 1104.0 0.0 85.0 1.000000 0.071489 0.0 ... 0.843798 0.928511 0.814765 0.187915 0.896299 0.901954 0.803909 0.816656 0.256808 0.185235
2 P_BAD0 0 0.02 4771.0 1070.0 0.0 119.0 1.000000 0.100084 0.0 ... 0.847876 0.899916 0.820470 0.183188 0.899171 0.901954 0.803909 0.816656 0.256808 0.179530
3 P_BAD0 0 0.03 4771.0 1042.0 0.0 147.0 1.000000 0.123633 0.0 ... 0.851265 0.876367 0.825168 0.179253 0.901550 0.901954 0.803909 0.816656 0.256808 0.174832
4 P_BAD0 0 0.04 4771.0 1012.0 0.0 177.0 1.000000 0.148865 0.0 ... 0.854926 0.851135 0.830201 0.174996 0.904112 0.901954 0.803909 0.816656 0.256808 0.169799
5 P_BAD0 0 0.05 4771.0 971.0 0.0 218.0 1.000000 0.183347 0.0 ... 0.859981 0.816653 0.837081 0.169105 0.907638 0.901954 0.803909 0.816656 0.256808 0.162919
6 P_BAD0 0 0.06 4771.0 956.0 0.0 233.0 1.000000 0.195963 0.0 ... 0.861845 0.804037 0.839597 0.166929 0.908935 0.901954 0.803909 0.816656 0.256808 0.160403
7 P_BAD0 0 0.07 4771.0 949.0 0.0 240.0 1.000000 0.201850 0.0 ... 0.862717 0.798150 0.840772 0.165909 0.909542 0.901954 0.803909 0.816656 0.256808 0.159228
8 P_BAD0 0 0.08 4771.0 941.0 0.0 248.0 1.000000 0.208579 0.0 ... 0.863717 0.791421 0.842114 0.164741 0.910236 0.901954 0.803909 0.816656 0.256808 0.157886
9 P_BAD0 0 0.09 4771.0 934.0 0.0 255.0 1.000000 0.214466 0.0 ... 0.864594 0.785534 0.843289 0.163716 0.910844 0.901954 0.803909 0.816656 0.256808 0.156711
10 P_BAD0 0 0.10 4771.0 923.0 0.0 266.0 1.000000 0.223717 0.0 ... 0.865975 0.776283 0.845134 0.162100 0.911801 0.901954 0.803909 0.816656 0.256808 0.154866
11 P_BAD0 0 0.11 4771.0 910.0 0.0 279.0 1.000000 0.234651 0.0 ... 0.867612 0.765349 0.847315 0.160183 0.912935 0.901954 0.803909 0.816656 0.256808 0.152685
12 P_BAD0 0 0.12 4770.0 895.0 1.0 294.0 0.999790 0.247267 0.0 ... 0.869454 0.752733 0.849664 0.157988 0.914143 0.901954 0.803909 0.816656 0.256808 0.150336
13 P_BAD0 0 0.13 4766.0 871.0 5.0 318.0 0.998952 0.267452 0.0 ... 0.872287 0.732548 0.853020 0.154515 0.915834 0.901954 0.803909 0.816656 0.256808 0.146980
14 P_BAD0 0 0.14 4766.0 863.0 5.0 326.0 0.998952 0.274180 0.0 ... 0.873310 0.725820 0.854362 0.153313 0.916538 0.901954 0.803909 0.816656 0.256808 0.145638
15 P_BAD0 0 0.15 4762.0 849.0 9.0 340.0 0.998114 0.285955 0.0 ... 0.874885 0.714045 0.856040 0.151310 0.917357 0.901954 0.803909 0.816656 0.256808 0.143960
16 P_BAD0 0 0.16 4761.0 833.0 10.0 356.0 0.997904 0.299411 0.0 ... 0.876892 0.700589 0.858557 0.148910 0.918669 0.901954 0.803909 0.816656 0.256808 0.141443
17 P_BAD0 0 0.17 4761.0 825.0 10.0 364.0 0.997904 0.306140 0.0 ... 0.877927 0.693860 0.859899 0.147691 0.919378 0.901954 0.803909 0.816656 0.256808 0.140101
18 P_BAD0 0 0.18 4761.0 815.0 10.0 374.0 0.997904 0.314550 0.0 ... 0.879224 0.685450 0.861577 0.146162 0.920267 0.901954 0.803909 0.816656 0.256808 0.138423
19 P_BAD0 0 0.19 4761.0 813.0 10.0 376.0 0.997904 0.316232 0.0 ... 0.879484 0.683768 0.861913 0.145856 0.920445 0.901954 0.803909 0.816656 0.256808 0.138087
20 P_BAD0 0 0.20 4760.0 802.0 11.0 387.0 0.997694 0.325484 0.0 ... 0.880862 0.674516 0.863591 0.144193 0.921320 0.901954 0.803909 0.816656 0.256808 0.136409
21 P_BAD0 0 0.21 4760.0 792.0 11.0 397.0 0.997694 0.333894 0.0 ... 0.882168 0.666106 0.865268 0.142651 0.922213 0.901954 0.803909 0.816656 0.256808 0.134732
22 P_BAD0 0 0.22 4759.0 783.0 12.0 406.0 0.997485 0.341463 0.0 ... 0.883292 0.658537 0.866611 0.141285 0.922913 0.901954 0.803909 0.816656 0.256808 0.133389
23 P_BAD0 0 0.23 4758.0 777.0 13.0 412.0 0.997275 0.346510 0.0 ... 0.884025 0.653490 0.867450 0.140379 0.923346 0.901954 0.803909 0.816656 0.256808 0.132550
24 P_BAD0 0 0.24 4756.0 773.0 15.0 416.0 0.996856 0.349874 0.0 ... 0.884442 0.650126 0.867785 0.139808 0.923495 0.901954 0.803909 0.816656 0.256808 0.132215
25 P_BAD0 0 0.25 4746.0 735.0 25.0 454.0 0.994760 0.381833 0.0 ... 0.888931 0.618167 0.872483 0.134100 0.925868 0.901954 0.803909 0.816656 0.256808 0.127517
26 P_BAD0 0 0.26 4741.0 724.0 30.0 465.0 0.993712 0.391085 0.0 ... 0.890128 0.608915 0.873490 0.132479 0.926338 0.901954 0.803909 0.816656 0.256808 0.126510
27 P_BAD0 0 0.27 4734.0 704.0 37.0 485.0 0.992245 0.407906 0.0 ... 0.892433 0.592094 0.875671 0.129459 0.927417 0.901954 0.803909 0.816656 0.256808 0.124329
28 P_BAD0 0 0.28 4728.0 684.0 43.0 505.0 0.990987 0.424727 0.0 ... 0.894811 0.575273 0.878020 0.126386 0.928607 0.901954 0.803909 0.816656 0.256808 0.121980
29 P_BAD0 0 0.29 4725.0 672.0 46.0 517.0 0.990358 0.434819 0.0 ... 0.896278 0.565181 0.879530 0.124514 0.929386 0.901954 0.803909 0.816656 0.256808 0.120470
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
70 P_BAD0 0 0.70 4337.0 294.0 434.0 895.0 0.909034 0.752733 0.0 ... 0.930886 0.247267 0.877852 0.063485 0.922570 0.901954 0.803909 0.816656 0.256808 0.122148
71 P_BAD0 0 0.71 4334.0 293.0 437.0 896.0 0.908405 0.753574 0.0 ... 0.930882 0.246426 0.877517 0.063324 0.922324 0.901954 0.803909 0.816656 0.256808 0.122483
72 P_BAD0 0 0.72 4276.0 275.0 495.0 914.0 0.896248 0.768713 0.0 ... 0.930577 0.231287 0.870805 0.060426 0.917400 0.901954 0.803909 0.816656 0.256808 0.129195
73 P_BAD0 0 0.73 4268.0 271.0 503.0 918.0 0.894571 0.772077 0.0 ... 0.930780 0.227923 0.870134 0.059705 0.916864 0.901954 0.803909 0.816656 0.256808 0.129866
74 P_BAD0 0 0.74 4261.0 270.0 510.0 919.0 0.893104 0.772918 0.0 ... 0.930553 0.227082 0.869128 0.059589 0.916147 0.901954 0.803909 0.816656 0.256808 0.130872
75 P_BAD0 0 0.75 4245.0 265.0 526.0 924.0 0.889751 0.777124 0.0 ... 0.930472 0.222876 0.867282 0.058758 0.914772 0.901954 0.803909 0.816656 0.256808 0.132718
76 P_BAD0 0 0.76 4224.0 260.0 547.0 929.0 0.885349 0.781329 0.0 ... 0.930110 0.218671 0.864597 0.057984 0.912804 0.901954 0.803909 0.816656 0.256808 0.135403
77 P_BAD0 0 0.77 4219.0 258.0 552.0 931.0 0.884301 0.783011 1.0 ... 0.930156 0.216989 0.864094 0.057628 0.912413 0.901954 0.803909 0.816656 0.256808 0.135906
78 P_BAD0 0 0.78 4212.0 257.0 559.0 932.0 0.882834 0.783852 0.0 ... 0.929924 0.216148 0.863087 0.057507 0.911688 0.901954 0.803909 0.816656 0.256808 0.136913
79 P_BAD0 0 0.79 4197.0 254.0 574.0 935.0 0.879690 0.786375 0.0 ... 0.929568 0.213625 0.861074 0.057066 0.910215 0.901954 0.803909 0.816656 0.256808 0.138926
80 P_BAD0 0 0.80 4188.0 251.0 583.0 938.0 0.877803 0.788898 0.0 ... 0.929551 0.211102 0.860067 0.056544 0.909446 0.901954 0.803909 0.816656 0.256808 0.139933
81 P_BAD0 0 0.81 4162.0 248.0 609.0 941.0 0.872354 0.791421 0.0 ... 0.928562 0.208579 0.856208 0.056236 0.906655 0.901954 0.803909 0.816656 0.256808 0.143792
82 P_BAD0 0 0.82 4153.0 247.0 618.0 942.0 0.870467 0.792262 0.0 ... 0.928211 0.207738 0.854866 0.056136 0.905681 0.901954 0.803909 0.816656 0.256808 0.145134
83 P_BAD0 0 0.83 4116.0 238.0 655.0 951.0 0.862712 0.799832 0.0 ... 0.927570 0.200168 0.850168 0.054662 0.902137 0.901954 0.803909 0.816656 0.256808 0.149832
84 P_BAD0 0 0.84 4085.0 235.0 686.0 954.0 0.856215 0.802355 0.0 ... 0.926262 0.197645 0.845470 0.054398 0.898691 0.901954 0.803909 0.816656 0.256808 0.154530
85 P_BAD0 0 0.85 4021.0 221.0 750.0 968.0 0.842800 0.814130 0.0 ... 0.924836 0.185870 0.837081 0.052098 0.892267 0.901954 0.803909 0.816656 0.256808 0.162919
86 P_BAD0 0 0.86 3990.0 215.0 781.0 974.0 0.836303 0.819176 0.0 ... 0.923996 0.180824 0.832886 0.051130 0.889037 0.901954 0.803909 0.816656 0.256808 0.167114
87 P_BAD0 0 0.87 3956.0 211.0 815.0 978.0 0.829176 0.822540 0.0 ... 0.922618 0.177460 0.827852 0.050636 0.885209 0.901954 0.803909 0.816656 0.256808 0.172148
88 P_BAD0 0 0.88 3891.0 202.0 880.0 987.0 0.815552 0.830109 0.0 ... 0.920163 0.169891 0.818456 0.049353 0.877933 0.901954 0.803909 0.816656 0.256808 0.181544
89 P_BAD0 0 0.89 3709.0 182.0 1062.0 1007.0 0.777405 0.846930 0.0 ... 0.911974 0.153070 0.791275 0.046775 0.856384 0.901954 0.803909 0.816656 0.256808 0.208725
90 P_BAD0 0 0.90 3588.0 170.0 1183.0 1019.0 0.752044 0.857023 0.0 ... 0.905923 0.142977 0.772987 0.045237 0.841365 0.901954 0.803909 0.816656 0.256808 0.227013
91 P_BAD0 0 0.91 3320.0 142.0 1451.0 1047.0 0.695871 0.880572 0.0 ... 0.891562 0.119428 0.732718 0.041017 0.806510 0.901954 0.803909 0.816656 0.256808 0.267282
92 P_BAD0 0 0.92 3030.0 115.0 1741.0 1074.0 0.635087 0.903280 0.0 ... 0.873149 0.096720 0.688591 0.036566 0.765538 0.901954 0.803909 0.816656 0.256808 0.311409
93 P_BAD0 0 0.93 2800.0 101.0 1971.0 1088.0 0.586879 0.915055 0.0 ... 0.854962 0.084945 0.652349 0.034816 0.729927 0.901954 0.803909 0.816656 0.256808 0.347651
94 P_BAD0 0 0.94 2157.0 54.0 2614.0 1135.0 0.452106 0.954584 0.0 ... 0.792141 0.045416 0.552349 0.024423 0.617875 0.901954 0.803909 0.816656 0.256808 0.447651
95 P_BAD0 0 0.95 1790.0 37.0 2981.0 1152.0 0.375183 0.968881 0.0 ... 0.740955 0.031119 0.493624 0.020252 0.542589 0.901954 0.803909 0.816656 0.256808 0.506376
96 P_BAD0 0 0.96 1347.0 18.0 3424.0 1171.0 0.282331 0.984861 0.0 ... 0.658293 0.015139 0.422483 0.013187 0.439048 0.901954 0.803909 0.816656 0.256808 0.577517
97 P_BAD0 0 0.97 900.0 5.0 3871.0 1184.0 0.188640 0.995795 0.0 ... 0.536289 0.004205 0.349664 0.005525 0.317125 0.901954 0.803909 0.816656 0.256808 0.650336
98 P_BAD0 0 0.98 792.0 4.0 3979.0 1185.0 0.166003 0.996636 0.0 ... 0.497800 0.003364 0.331711 0.005025 0.284534 0.901954 0.803909 0.816656 0.256808 0.668289
99 P_BAD0 0 0.99 484.0 2.0 4287.0 1187.0 0.101446 0.998318 0.0 ... 0.360387 0.001682 0.280369 0.004115 0.184135 0.901954 0.803909 0.816656 0.256808 0.719631

100 rows × 21 columns


§ FitStat_1_GradBoost
Fit Statistics for _AUTOTUNE_DEFAULT_SCORE_TABLE_
NOBS ASE DIV RASE MCE MCLL
0 5960.0 0.081221 5960.0 0.284993 0.106879 0.276219

§ TunerInfo_1_GradBoost
Tuner Information
Parameter Value
0 Model Type Gradient Boosting Tree
1 Tuner Objective Function Misclassification
2 Search Method GRID
3 Number of Grid Points 16
4 Maximum Tuning Time in Seconds 36000
5 Validation Type Cross-Validation
6 Num Folds in Cross-Validation 5
7 Log Level 0
8 Seed 726656387
9 Number of Parallel Evaluations 4
10 Number of Workers per Subsession 0

§ TunerResults_1_GradBoost
Tuner Results
Evaluation M LEARNINGRATE SUBSAMPLERATE LASSO RIDGE NBINS MAXLEVEL MeanConseqError EvaluationTime
0 0 4 0.10 0.5 0.0 1.0 50 5 0.199497 0.928658
1 2 4 0.10 0.6 0.0 0.0 77 7 0.121962 11.134599
2 3 4 0.10 0.8 0.5 0.0 77 7 0.126618 9.764959
3 7 4 0.10 0.6 0.5 0.0 77 5 0.128040 6.044404
4 9 4 0.10 0.8 0.0 0.0 77 5 0.128141 6.599532
5 11 4 0.10 0.8 0.0 0.0 77 7 0.128396 9.206150
6 5 4 0.10 0.6 0.0 0.0 77 5 0.129550 6.472404
7 8 4 0.10 0.6 0.5 0.0 77 7 0.130851 9.779642
8 15 4 0.10 0.8 0.5 0.0 77 5 0.147987 4.466785
9 6 4 0.05 0.8 0.0 0.0 77 5 0.199362 1.984114
10 10 4 0.05 0.8 0.5 0.0 77 7 0.199385 2.350897

§ IterationHistory_1_GradBoost
Tuner Iteration History
Iteration Evaluations Best_obj Time_sec
0 0 1 0.199497 0.928658
1 1 17 0.121962 21.858145

§ EvaluationHistory_1_GradBoost
Tuner Evaluation History
Evaluation Iteration M LEARNINGRATE SUBSAMPLERATE LASSO RIDGE NBINS MAXLEVEL MeanConseqError EvaluationTime
0 0 0 4 0.10 0.5 0.0 1.0 50 5 0.199497 0.928658
1 1 1 4 0.05 0.8 0.0 0.0 77 7 0.199463 3.104330
2 2 1 4 0.10 0.6 0.0 0.0 77 7 0.121962 11.134599
3 3 1 4 0.10 0.8 0.5 0.0 77 7 0.126618 9.764959
4 4 1 4 0.05 0.6 0.5 0.0 77 7 0.199530 3.689206
5 5 1 4 0.10 0.6 0.0 0.0 77 5 0.129550 6.472404
6 6 1 4 0.05 0.8 0.0 0.0 77 5 0.199362 1.984114
7 7 1 4 0.10 0.6 0.5 0.0 77 5 0.128040 6.044404
8 8 1 4 0.10 0.6 0.5 0.0 77 7 0.130851 9.779642
9 9 1 4 0.10 0.8 0.0 0.0 77 5 0.128141 6.599532
10 10 1 4 0.05 0.8 0.5 0.0 77 7 0.199385 2.350897
11 11 1 4 0.10 0.8 0.0 0.0 77 7 0.128396 9.206150
12 12 1 4 0.05 0.8 0.5 0.0 77 5 0.199609 1.112318
13 13 1 4 0.05 0.6 0.0 0.0 77 7 0.199609 1.428851
14 14 1 4 0.05 0.6 0.0 0.0 77 5 0.199609 1.065779
15 15 1 4 0.10 0.8 0.5 0.0 77 5 0.147987 4.466785
16 16 1 4 0.05 0.6 0.5 0.0 77 5 0.199440 1.177131

§ BestConfiguration_1_GradBoost
Best Configuration
Parameter Name Value
0 Evaluation Evaluation 2
1 Number of Variables to Try M 4
2 Learning Rate LEARNINGRATE 0.1
3 Sampling Rate SUBSAMPLERATE 0.6
4 Lasso LASSO 0
5 Ridge RIDGE 0
6 Number of Bins NBINS 77
7 Maximum Tree Levels MAXLEVEL 7
8 Misclassification Objective 0.121961723

§ TunerSummary_1_GradBoost
Tuner Summary
Parameter Value
0 Initial Configuration Objective Value 0.199497
1 Best Configuration Objective Value 0.121962
2 Worst Configuration Objective Value 0.199609
3 Initial Configuration Evaluation Time in Seconds 0.928658
4 Best Configuration Evaluation Time in Seconds 10.997567
5 Number of Improved Configurations 5.000000
6 Number of Evaluated Configurations 17.000000
7 Total Tuning Time in Seconds 24.530787
8 Parallel Tuning Speedup 3.358467

§ TunerTiming_1_GradBoost
Tuner Task Timing
Task Time_sec Time_percent
0 Model Training 77.405939 93.955399
1 Model Scoring 4.508735 5.472707
2 Total Objective Evaluations 81.922060 99.437071
3 Tuner 0.463774 0.562929
4 Total CPU Time 82.385834 100.000000

§ TunerCasOutputTables_1_GradBoost
Tuner CAS Output Tables
CAS_Library Name Rows Columns
0 CASUSER(SASDEMO) ASTORE_OUT_PY_gradBoost_1 1 2

§ HyperparameterImportance_1_GradBoost
Hyperparameter Importance
Hyperparameter RelImportance
0 LEARNINGRATE 1.000000
1 MAXLEVEL 0.012582
2 SUBSAMPLERATE 0.000550
3 LASSO 0.000193
4 M 0.000000
5 RIDGE 0.000000
6 NBINS 0.000000

§ OutputCasTables
casLib Name Rows Columns casTable
0 CASUSER(sasdemo) PIPELINE_OUT_PY 10 15 CASTable('PIPELINE_OUT_PY', caslib='CASUSER(sa...
1 CASUSER(sasdemo) TRANSFORMATION_OUT_PY 17 21 CASTable('TRANSFORMATION_OUT_PY', caslib='CASU...
2 CASUSER(sasdemo) FEATURE_OUT_PY 23 15 CASTable('FEATURE_OUT_PY', caslib='CASUSER(sas...
3 CASUSER(sasdemo) ASTORE_OUT_PY_fm_ 1 2 CASTable('ASTORE_OUT_PY_fm_', caslib='CASUSER(...
4 CASUSER(sasdemo) ASTORE_OUT_PY_gradBoost_1 1 2 CASTable('ASTORE_OUT_PY_gradBoost_1', caslib='...

elapsed 125s · user 0.729s · sys 0.254s · mem 0.276MB
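
Each table above is returned as an entry in the action's result object, which behaves like a Python dictionary of DataFrames keyed by the table names shown after each § marker. As a minimal sketch, assuming the dsAutoMl call earlier in the notebook was stored in a variable such as res (a hypothetical name, e.g. res = conn.dataSciencePilot.dsAutoMl(...)), the winning gradient boosting configuration and the relative importance of its tuned hyperparameters could be pulled out directly:

# Hypothetical: assumes the action result was saved as res
print(res['BestConfiguration_1_GradBoost'])         # winning hyperparameter values
print(res['HyperparameterImportance_1_GradBoost'])  # which tuned parameters mattered most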


In [23]:
conn.fetch(table = {"name" : "TRANSFORMATION_OUT_PY"})


Out[23]:
§ Fetch
Selected Rows from Table TRANSFORMATION_OUT_PY
FTGPipelineId Name NVariables IsInteraction ImputeMethod OutlierMethod OutlierTreat OutlierArgs FunctionMethod FunctionArgs ... MapIntervalArgs HashMethod HashArgs DateTimeMethod DiscretizeMethod DiscretizeArgs CatTransMethod CatTransArgs InteractionMethod InteractionSynthesizer
0 1.0 miss_ind 3.0 NaN ... NaN MissIndicator 2.0 NaN Label (Sparse One-Hot) NaN
1 2.0 hc_tar_frq_rat 1.0 NaN ... 10.0 NaN NaN NaN
2 3.0 hc_lbl_cnt 1.0 NaN ... 0.0 NaN NaN NaN
3 4.0 hc_cnt 1.0 NaN ... 0.0 NaN NaN NaN
4 5.0 hc_cnt_log 1.0 NaN Log e ... 0.0 NaN NaN NaN
5 6.0 lcnhenhi_grp_rare 2.0 NaN ... NaN NaN NaN Group Rare 5.0
6 7.0 lcnhenhi_dtree5 2.0 NaN ... NaN NaN NaN DTree 5.0
7 8.0 lcnhenhi_dtree10 2.0 NaN ... NaN NaN NaN DTree 10.0
8 9.0 hk_yj_n2 1.0 Median NaN Yeo-Johnson -2 ... NaN NaN NaN NaN
9 10.0 hk_yj_n1 1.0 Median NaN Yeo-Johnson -1 ... NaN NaN NaN NaN
10 11.0 hk_yj_0 1.0 Median NaN Yeo-Johnson 0 ... NaN NaN NaN NaN
11 12.0 hk_yj_p1 1.0 Median NaN Yeo-Johnson 1 ... NaN NaN NaN NaN
12 13.0 hk_yj_p2 1.0 Median NaN Yeo-Johnson 2 ... NaN NaN NaN NaN
13 14.0 hk_dtree_disct5 1.0 NaN ... NaN NaN DTree 5.0 NaN
14 15.0 hk_dtree_disct10 1.0 NaN ... NaN NaN DTree 10.0 NaN
15 16.0 cpy_int_med_imp 1.0 Median NaN ... NaN NaN NaN NaN
16 17.0 cpy_nom_miss_lev_lab 2.0 NaN ... NaN NaN NaN Label (Sparse One-Hot) 0.0

17 rows × 21 columns

elapsed 0.00252s · user 0.00238s · sys 7.7e-05s · mem 0.968MB
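
The fetch action truncates wide tables for display, so most of the 21 transformation columns are hidden behind the ellipsis. As a minimal sketch, the transformation pipeline table could instead be pulled into a local pandas DataFrame with SWAT's to_frame method so any column of interest is visible:

# Pull TRANSFORMATION_OUT_PY client-side for full inspection
trans = conn.CASTable('TRANSFORMATION_OUT_PY').to_frame()
pd.set_option('display.max_columns', None)  # show every pipeline column
print(trans[['FTGPipelineId', 'Name', 'NVariables', 'ImputeMethod']])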


In [24]:
conn.fetch(table = {"name" : "FEATURE_OUT_PY"})


Out[24]:
§ Fetch
Selected Rows from Table FEATURE_OUT_PY
FeatureId Name IsNominal FTGPipelineId NInputs InputVar1 InputVar2 InputVar3 Label RankCrit BestTransRank GlobalIntervalRank GlobalNominalRank GlobalRank IsGenerated
0 1.0 cpy_int_med_imp_DEBTINC 0.0 16.0 1.0 DEBTINC DEBTINC: Low missing rate - median imputation 0.086483 1.0 1.0 NaN 4.0 1.0
1 2.0 hk_dtree_disct10_DEBTINC 1.0 15.0 1.0 DEBTINC DEBTINC: High kurtosis - ten bin decision tree... 0.102374 3.0 NaN 3.0 3.0 0.0
2 3.0 hk_dtree_disct5_DEBTINC 1.0 14.0 1.0 DEBTINC DEBTINC: High kurtosis - five bin decision tre... 0.129960 2.0 NaN 2.0 2.0 1.0
3 4.0 hk_yj_0_DEBTINC 0.0 11.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=0)... 0.080955 3.0 3.0 NaN 6.0 0.0
4 5.0 hk_yj_n1_DEBTINC 0.0 10.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=-1... 0.060571 4.0 4.0 NaN 9.0 0.0
5 6.0 hk_yj_n2_DEBTINC 0.0 9.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=-2... 0.007162 6.0 10.0 NaN 17.0 0.0
6 7.0 hk_yj_p1_DEBTINC 0.0 12.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=1)... 0.086483 1.0 1.0 NaN 4.0 1.0
7 8.0 hk_yj_p2_DEBTINC 0.0 13.0 1.0 DEBTINC DEBTINC: High kurtosis - Yeo-Johnson(lambda=2)... 0.044039 5.0 5.0 NaN 12.0 0.0
8 9.0 miss_ind_DEBTINC 1.0 1.0 1.0 DEBTINC DEBTINC: Significant missing - missing indicator 0.251610 1.0 NaN 1.0 1.0 1.0
9 10.0 cpy_nom_miss_lev_lab_DELINQ 1.0 17.0 1.0 DELINQ DELINQ: Low missing rate - missing level 0.068430 1.0 NaN 4.0 7.0 1.0
10 11.0 lcnhenhi_dtree10_DELINQ 1.0 8.0 1.0 DELINQ DELINQ: Low cardinality, not high (entropy, IQ... 0.000000 4.0 NaN 10.0 20.0 0.0
11 12.0 lcnhenhi_dtree5_DELINQ 1.0 7.0 1.0 DELINQ DELINQ: Low cardinality, not high (entropy, IQ... 0.000000 4.0 NaN 10.0 20.0 0.0
12 13.0 lcnhenhi_grp_rare_DELINQ 1.0 6.0 1.0 DELINQ DELINQ: Low cardinality, not high (entropy, IQ... 0.068430 1.0 NaN 4.0 7.0 1.0
13 14.0 miss_ind_DELINQ 1.0 1.0 1.0 DELINQ DELINQ: Significant missing - missing indicator 0.005183 3.0 NaN 9.0 19.0 0.0
14 15.0 cpy_nom_miss_lev_lab_DEROG 1.0 17.0 1.0 DEROG DEROG: Low missing rate - missing level 0.051227 1.0 NaN 6.0 10.0 1.0
15 16.0 lcnhenhi_dtree10_DEROG 1.0 8.0 1.0 DEROG DEROG: Low cardinality, not high (entropy, IQV... 0.000000 4.0 NaN 10.0 20.0 0.0
16 17.0 lcnhenhi_dtree5_DEROG 1.0 7.0 1.0 DEROG DEROG: Low cardinality, not high (entropy, IQV... 0.000000 4.0 NaN 10.0 20.0 0.0
17 18.0 lcnhenhi_grp_rare_DEROG 1.0 6.0 1.0 DEROG DEROG: Low cardinality, not high (entropy, IQV... 0.051227 1.0 NaN 6.0 10.0 1.0
18 19.0 miss_ind_DEROG 1.0 1.0 1.0 DEROG DEROG: Significant missing - missing indicator 0.006342 3.0 NaN 8.0 18.0 0.0
19 20.0 hc_cnt_LOAN 0.0 4.0 1.0 LOAN LOAN: High cardinality - count encoding 0.015641 2.0 7.0 NaN 14.0 1.0

elapsed 0.00216s · user 0.00105s · sys 0.00101s · mem 0.968MB
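
FEATURE_OUT_PY holds 23 engineered features, but fetch returns only the first 20 rows by default. A minimal sketch for ranking all of the features client-side, sorting on GlobalRank (where rank 1 is the strongest feature, as the table above shows):

# Sort engineered features by overall rank across the interval and nominal groups
feats = conn.CASTable('FEATURE_OUT_PY').to_frame()
cols = ['Name', 'InputVar1', 'RankCrit', 'GlobalRank', 'IsGenerated']
print(feats.sort_values('GlobalRank')[cols].head(10))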


In [25]:
conn.fetch(table = {"name" : "PIPELINE_OUT_PY"})


Out[25]:
§ Fetch
Selected Rows from Table PIPELINE_OUT_PY
PipelineId ModelType MLType Objective ObjectiveType Target NFeatures Feat1Id Feat1IsNom Feat2Id Feat2IsNom Feat3Id Feat3IsNom Feat4Id Feat4IsNom
0 2.0 binary classification gradBoost 0.114747 MCE BAD 4.0 10.0 1.0 15.0 1.0 9.0 1.0 23.0 0.0
1 9.0 binary classification dtree 0.115100 MCE BAD 4.0 13.0 1.0 18.0 1.0 3.0 1.0 23.0 0.0
2 10.0 binary classification gradBoost 0.121141 MCE BAD 4.0 13.0 1.0 18.0 1.0 3.0 1.0 23.0 0.0
3 1.0 binary classification dtree 0.121455 MCE BAD 4.0 10.0 1.0 15.0 1.0 9.0 1.0 23.0 0.0
4 3.0 binary classification dtree 0.126139 MCE BAD 3.0 13.0 1.0 15.0 1.0 3.0 1.0 NaN NaN
5 4.0 binary classification gradBoost 0.127818 MCE BAD 3.0 13.0 1.0 15.0 1.0 3.0 1.0 NaN NaN
6 8.0 binary classification gradBoost 0.132595 MCE BAD 3.0 13.0 1.0 15.0 1.0 9.0 1.0 NaN NaN
7 7.0 binary classification dtree 0.133389 MCE BAD 3.0 13.0 1.0 15.0 1.0 9.0 1.0 NaN NaN
8 5.0 binary classification dtree 0.180872 MCE BAD 1.0 10.0 1.0 NaN NaN NaN NaN NaN NaN
9 6.0 binary classification gradBoost 0.185434 MCE BAD 1.0 10.0 1.0 NaN NaN NaN NaN NaN NaN

elapsed 0.00192s · user 0.00174s · sys 5.2e-05s · mem 0.968MB
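
Each candidate pipeline is scored by its misclassification error (the MCE objective) on the target BAD, so the row with the smallest Objective value identifies the winning pipeline; here that is the gradient boosting pipeline with an objective of about 0.1147. A minimal sketch for picking it out locally:

# The best pipeline is the one that minimizes the MCE objective
pipes = conn.CASTable('PIPELINE_OUT_PY').to_frame()
best = pipes.loc[pipes['Objective'].idxmin()]
print(best[['PipelineId', 'MLType', 'Objective', 'NFeatures']])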


Conclusion

The dataSciencePilot action set consists of actions that implement a policy-based, configurable, and scalable approach to automating data science workflows. This action set can be used to automate an end-to-end workflow or to automate individual steps in the workflow such as data preparation, feature preprocessing, feature engineering, feature selection, and hyperparameter tuning. In this notebook, we demonstrated how to use each action in the dataSciencePilot action set through the SWAT Python interface.


In [26]:
conn.close()